(54) Assessing colorectal cancer

(57) A method of assessing the presence or absence of colorectal cancer, or the likely condition of a person believed to have colorectal cancer, is conducted by analyzing the expression of a group of genes. Gene expression profiles in a variety of media, such as microarrays, are included, as are kits that contain them.

Description
BACKGROUND 

[0001] This application claims the benefit of U.S. Provisional Application No. 60/368,798, filed on March 29, 2002.
[0002] This invention relates to diagnostics and prognostics for colorectal cancer based on the gene expression 
profiles of biological samples. 

[0003] Colorectal cancer is a heterogeneous disease, consisting of tumors thought to emerge through three major molecular mechanisms: 1) mutations in the adenomatous polyposis coli (APC) gene, or the β-catenin gene, combined with chromosomal instability, 2) mutations in DNA mismatch repair genes, such as MLH1, MSH2, PMS1, PMS2 and MSH6, associated with microsatellite instability and mutations in genes containing short repeats, and 3) gene silencing induced by hypermethylation of the promoter regions of tumor suppressor genes. The genetic complement of individual colorectal cancers is likely to include different combinations of genetic instability, specific mutations, and gene silencing. Chromosomal instability (CIN) is a common feature of cancers in general. It implies an aneuploid phenotype, in which whole chromosomes or large parts of them are being lost or gained. Microsatellite instability (MIN) is found in diploid tumors with an increased mutation rate in short repeats. Both forms of genetic instability are common in colorectal cancer.

[0004] Colorectal cancers thus have complex origins and involve a number of interactions in different biological pathways. Serum markers and histological and cytological examinations historically used to assist in diagnostic, prognostic, or therapy monitoring decisions often do not have the desired reliability. Likewise, while use of a single genetic marker (e.g., increased expression of a particular gene) may be beneficial, the diversity of the cancers makes it more likely that a portfolio of genetic markers is the best approach.

SUMMARY OF THE INVENTION 


[0005] The invention is a method of assessing the presence or absence of colorectal cancer or the likely condition of a person believed to have colorectal cancer. In the method, a gene expression profile of a patient sample is analyzed to determine whether a patient has colorectal cancer, whether a patient does not have colorectal cancer, whether a patient is likely to get colorectal cancer, or the response to treatment of a patient being treated for colorectal cancer.

[0006] Articles used in practising the methods are also an aspect of the invention. Such articles include gene expression profiles or representations of them that are fixed in machine-readable media such as computer readable media.

[0007] Articles used to identify gene expression profiles can also include substrates or surfaces, such as microarrays, to capture and/or indicate the presence, absence, or degree of gene expression.

DETAILED DESCRIPTION 

[0008] The mere presence or absence of particular nucleic acid sequences in a tissue sample has only rarely been found to have diagnostic or prognostic value. Information about the expression of various proteins, peptides, or mRNA, on the other hand, is increasingly viewed as important. The mere presence within the genome of nucleic acid sequences having the potential to express proteins, peptides, or mRNA (such sequences referred to as "genes") is not by itself determinative of whether a protein, peptide, or mRNA is expressed in a given cell. Whether or not a given gene capable of expressing proteins, peptides, or mRNA does so, and to what extent such expression occurs, if at all, is determined by a variety of complex factors. Irrespective of difficulties in understanding and assessing these factors, assaying gene expression can provide useful information about the occurrence of important events such as tumorigenesis, metastasis, apoptosis, and other clinically relevant phenomena. Relative indications of the degree to which genes are active or inactive can be found in gene expression profiles. The gene expression profiles of this invention are used to diagnose and treat patients for colorectal cancer.

[0009] Sample preparation requires the collection of patient samples. Patient samples used in the inventive method are those that are suspected of containing diseased cells, such as epithelial cells taken from a colon sample or from surgical margins. One useful technique for obtaining suspect samples is Laser Capture Microdissection (LCM). LCM technology provides a way to select the cells to be studied, minimizing variability caused by cell type heterogeneity. Consequently, moderate or small changes in gene expression between normal and cancerous cells can be readily detected. In a preferred method, the samples comprise circulating epithelial cells extracted from peripheral blood. These can be obtained according to a number of methods, but the most preferred method is the magnetic separation technique described in U.S. Patent 6,136,182 assigned to Immunivest Corp., which is incorporated herein by reference. Once the sample containing the cells of interest has been obtained, RNA is extracted and amplified and a gene expression profile is obtained, preferably via microarray, for genes in the appropriate portfolios.

[0010] Preferred methods for establishing gene expression profiles include determining the amount of RNA that is produced by a gene that can code for a protein or peptide. This is accomplished by reverse transcriptase PCR (RT-PCR), competitive RT-PCR, real time RT-PCR, differential display RT-PCR, Northern Blot analysis, and other related tests. While it is possible to conduct these techniques using individual PCR reactions, it is best to amplify complementary DNA (cDNA) or complementary RNA (cRNA) produced from mRNA and analyze it via microarray. A number of different array configurations and methods for their production are known to those of skill in the art and are described in U.S. Patents such as: 5,445,934; 5,532,128; 5,556,752; 5,242,974; 5,384,261; 5,405,783; 5,412,087; 5,424,186; 5,429,807; 5,436,327; 5,472,672; 5,527,681; 5,529,756; 5,545,531; 5,554,501; 5,561,071; 5,571,639; 5,593,839; 5,599,695; 5,624,711; 5,658,734; and 5,700,637; the disclosures of which are incorporated herein by reference.

[0011] Microarray technology allows for the measurement of the steady-state mRNA level of thousands of genes simultaneously, thereby presenting a powerful tool for identifying effects such as the onset, arrest, or modulation of uncontrolled cell proliferation. Two microarray technologies are currently in wide use. The first is cDNA arrays and the second is oligonucleotide arrays. Although differences exist in the construction of these chips, essentially all downstream data analysis and output are the same. The product of these analyses is typically a set of measurements of the intensity of the signal received from a labeled probe used to detect a cDNA sequence from the sample that hybridizes to a nucleic acid sequence at a known location on the microarray. Typically, the intensity of the signal is proportional to the quantity of cDNA, and thus mRNA, expressed in the sample cells. A large number of such techniques are available and useful. Preferred methods for determining gene expression can be found in US Patents 6,271,002 to Linsley, et al.; 6,218,122 to Friend, et al.; 6,218,114 to Peck, et al.; and 6,004,755 to Wang, et al., the disclosure of each of which is incorporated herein by reference.

[0012] Analysis of the expression levels is conducted by comparing such intensities. This is best done by generating a ratio matrix of the expression intensities of genes in a test sample versus those in a control sample. For instance, the gene expression intensities from a diseased tissue can be compared with the expression intensities generated from normal tissue of the same type (e.g., diseased colon tissue sample vs. normal colon tissue sample). A ratio of these expression intensities indicates the fold-change in gene expression between the test and control samples.
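
The ratio matrix described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and intensity values are hypothetical, samples are assumed to be paired one-to-one between test and control, and the log2 transform is a common convenience that the specification does not prescribe.

```python
import math

def ratio_matrix(test, control):
    """Return per-gene test/control intensity ratios, one ratio per paired sample."""
    return {
        gene: [t / c for t, c in zip(test[gene], control[gene])]
        for gene in test
    }

# Hypothetical intensities: two genes, two paired samples each.
test = {"GENE_A": [800.0, 1200.0], "GENE_B": [100.0, 50.0]}
control = {"GENE_A": [400.0, 300.0], "GENE_B": [200.0, 200.0]}

ratios = ratio_matrix(test, control)

# Log-transformed ratios are symmetric around zero:
# positive means up regulated, negative means down regulated.
log2_ratios = {g: [math.log2(r) for r in rs] for g, rs in ratios.items()}
```

A ratio of 2.0 corresponds to a two-fold increase over the control and a ratio of 0.5 to a two-fold decrease.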

[0013] Gene expression profiles can also be displayed in a number of ways. The most common method is to arrange raw fluorescence intensities or a ratio matrix into a graphical dendrogram where columns indicate test samples and rows indicate genes. The data are arranged so that genes that have similar expression profiles are proximal to each other. The expression ratio for each gene is visualized as a color. For example, a ratio less than one (indicating down-regulation) may appear in the blue portion of the spectrum while a ratio greater than one (indicating up-regulation) may appear as a color in the red portion of the spectrum. Commercial computer software programs are available to display such data, including "GENESPRING" from Silicon Genetics, Inc. and "DISCOVERY" and "INFER" software from Partek, Inc.

[0014] Modulated genes used in the methods of the invention are shown in Table 1. The genes that are differentially expressed are shown as being either up regulated or down regulated in diseased cells. Up regulation and down regulation are relative terms meaning that a detectable difference (beyond the contribution of noise in the system used to measure it) is found in the amount of expression of the genes relative to some baseline. In this case, the baseline is the measured gene expression of a normal cell. The genes of interest in the diseased cells are then either up regulated or down regulated relative to the baseline level using the same measurement method. Diseased, in this context, refers to an alteration of the state of a body that interrupts or disturbs, or has the potential to disturb, proper performance of bodily functions, as occurs with the uncontrolled proliferation of cells. Someone is diagnosed with a disease when some aspect of that person's genotype or phenotype is consistent with the presence of the disease. However, the act of conducting a diagnosis or prognosis includes the determination of disease/status issues such as therapy monitoring. In therapy monitoring, clinical judgments are made regarding the effect of a given course of therapy by comparing the expression of genes over time to determine whether the gene expression profiles have changed or are changing to patterns more consistent with normal tissue.

[0015] Preferably, levels of up and down regulation are distinguished based on fold changes of the intensity measurements of hybridized microarray probes. A 2.0 fold difference, or a p-value less than 0.05, is preferred for making such distinctions. That is, before a gene is said to be differentially expressed in diseased versus normal cells, the diseased cell is found to yield at least 2 times more, or 2 times less, intensity than the normal cells. The greater the fold difference, the more preferred is use of the gene as a diagnostic. Genes selected for the gene expression profiles of the instant invention have expression levels that result in the generation of a signal that is distinguishable from those of the normal or non-modulated genes by an amount that exceeds background using clinical laboratory instrumentation.
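
As a minimal sketch of the two-fold criterion, the function below labels a gene from a single pair of intensities. The function name, the labels, and the choice to treat exactly 2.0 as meeting the threshold are assumptions made for illustration only.

```python
def regulation_call(diseased, normal, min_fold=2.0):
    """Label a gene as up regulated, down regulated, or unchanged.

    diseased and normal are positive intensity measurements for the same gene.
    """
    ratio = diseased / normal
    if ratio >= min_fold:
        return "up"
    if ratio <= 1.0 / min_fold:
        return "down"
    return "unchanged"

# Hypothetical intensities for three genes.
calls = {
    "GENE_A": regulation_call(900.0, 300.0),
    "GENE_B": regulation_call(100.0, 400.0),
    "GENE_C": regulation_call(150.0, 100.0),
}
```

In practice the call would be made on replicate measurements together with a statistical test, not on one pair of values.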
[0016] Statistical values can be used to confidently distinguish modulated from non-modulated genes and noise. Statistical tests find the genes most significantly different between diverse groups of samples. The Student's t-test is an example of a robust statistical test that can be used to find significant differences between two groups. The lower the p-value, the more compelling the evidence that the gene is showing a difference between the different groups. Nevertheless, since microarrays measure more than one gene at a time, tens of thousands of statistical tests may be performed at one time. Because of this, small p-values are likely to be seen just by chance, and adjustments for this can be made using a Sidak correction as well as a randomization/permutation experiment. A p-value less than 0.05 by the t-test is evidence that the gene is significantly different. More compelling evidence is a p-value less than 0.05 after the Sidak correction is factored in. For a large number of samples in each group, a p-value less than 0.05 after the randomization/permutation test is the most compelling evidence of a significant difference.
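
The two adjustments mentioned above can be illustrated as follows. This is a sketch under assumptions, not the procedure used in the specification: the Sidak adjustment uses the standard formula 1 - (1 - p)^m for m independent tests, and the permutation test shown enumerates every relabeling of a small data set exactly, using the absolute difference in group means as the test statistic. All function names and data are hypothetical.

```python
from itertools import combinations

def sidak_adjust(p, n_tests):
    """Sidak-adjusted p-value for one of n_tests independent tests."""
    return 1.0 - (1.0 - p) ** n_tests

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    hits = count = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        count += 1
        if abs_mean_diff(sum(pooled[i] for i in idx)) >= observed - 1e-12:
            hits += 1
    return hits / count

# Hypothetical intensities for one gene in diseased and normal samples.
p_perm = permutation_p_value([10.0, 11.0, 12.0], [1.0, 2.0, 3.0])
```

Exhaustive enumeration is only practical for small groups; with larger groups a random subset of relabelings is typically sampled instead. The example also shows why the text asks for a large number of samples: with three samples per group the smallest attainable two-sided p-value is 2/20.
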

[0017] Another parameter that can be used to select genes that generate a signal that is greater than that of the non-modulated gene or noise is the use of a measurement of absolute signal difference. Preferably, the signal generated by the modulated gene expression is at least 20% different than those of the normal or non-modulated gene (on an absolute basis). It is even more preferred that such genes produce expression patterns that are at least 30% different than those of normal or non-modulated genes.

[0018] Genes can be grouped so that information obtained about the set of genes in the group provides a sound basis for making a clinically relevant judgment such as a diagnosis, prognosis, or treatment choice. These sets of genes make up the portfolios of the invention. In this case, the judgments supported by the portfolios involve colorectal cancer. Portfolios of gene expression profiles can be comprised of combinations of genes described in Example 3. As with most diagnostic markers, it is often desirable to use the fewest number of markers sufficient to make a correct medical judgment. This prevents a delay in treatment pending further analysis as well as inappropriate use of time and resources. In this case, such a minimal portfolio can be comprised of a combination of genes from Example 4.
[0019] Preferably, portfolios are established such that the combination of genes in the portfolio exhibits improved sensitivity and specificity relative to individual genes or randomly selected combinations of genes. In the context of the instant invention, the sensitivity of the portfolio can be reflected in the fold differences exhibited by a gene's expression in the diseased state relative to the normal state. Specificity can be reflected in statistical measurements of the correlation of the signaling of gene expression with the condition of interest. For example, standard deviation can be used as such a measurement. In considering a group of genes for inclusion in a portfolio, a small standard deviation in expression measurements correlates with greater specificity. Other measurements of variation such as correlation coefficients can also be used in this capacity. The most preferred method of establishing gene expression portfolios is through the use of optimization algorithms such as the mean variance algorithm widely used in establishing stock portfolios. This method is described in detail in the co-pending patent application entitled "Portfolio Selection" by Tim Jatkoe, et al., of equal date hereto. Essentially, the method calls for the establishment of a set of inputs (stocks in financial applications, expression as measured by intensity here) that will optimize the return (e.g., signal that is generated) one receives for using it while minimizing the variability of the return. Many commercial software programs are available to conduct such operations. "Wagner Associates Mean-Variance Optimization Application", referred to as "Wagner Software" throughout this specification, is preferred. This software uses functions from the "Wagner Associates Mean-Variance Optimization Library" to determine an efficient frontier and optimal portfolios in the Markowitz sense.

[0020] Use of this type of software requires that microarray data be transformed so that it can be treated as an input in the way stock return and risk measurements are used when the software is used for its intended financial analysis purposes. For example, when Wagner Software is employed in conjunction with microarray intensity measurements, the following data transformation method is employed.

[0021] Genes are first pre-selected by identifying those genes whose expression shows at least some minimal level of differentiation. The preferred pre-selection process is conducted as follows. A baseline class is selected. Typically, this will comprise genes from a population that does not have the condition of interest. For example, if one were interested in selecting a portfolio of genes that are diagnostic for breast cancer, samples from patients without breast cancer can be used to make the baseline class. Once the baseline class is selected, the arithmetic mean and standard deviation are calculated for the indicator of gene expression of each gene for baseline class samples. This indicator is typically the fluorescent intensity of a microarray reading. The statistical data computed are then used to calculate a baseline value of (X*Standard Deviation + Mean) for each gene. This is the baseline reading for the gene against which all other samples will be compared. X is a stringency variable selected by the person formulating the portfolio. Higher values of X are more stringent than lower. Preferably, X is in the range of 0.5 to 3, with 2 to 3 being more preferred and 3 being most preferred.

[0022] Ratios between each experimental sample (those displaying the condition of interest) versus baseline readings are then calculated. The ratios are then transformed to base 10 logarithmic values for ease of data handling by the software. This enables down regulated genes to display the negative values necessary for optimization according to the Markowitz mean-variance algorithm using the Wagner Software.
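
The baseline reading and log-ratio transformation described in the two preceding paragraphs can be sketched as below. The function names and sample values are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) rather than the population standard deviation is an assumption, since the specification does not say which is intended.

```python
import math
from statistics import mean, stdev

def baseline_reading(baseline_intensities, x=3.0):
    """Return X * standard deviation + mean for one gene in the baseline class."""
    return x * stdev(baseline_intensities) + mean(baseline_intensities)

def log_ratios(experimental_intensities, baseline):
    """Base 10 log of each experimental intensity relative to the baseline reading."""
    return [math.log10(value / baseline) for value in experimental_intensities]

# Hypothetical intensities for a single gene.
baseline = baseline_reading([90.0, 100.0, 110.0], x=2.0)  # mean 100, SD 10
transformed = log_ratios([1200.0, 12.0], baseline)
```

Values above the baseline reading become positive and values below it become negative, which is the property the text relies on for the mean-variance optimization.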

[0023] The preprocessed data comprising these transformed ratios are used as inputs in place of the asset return values that are normally used in the Wagner Software when it is used for financial analysis purposes.

[0024] Once an efficient frontier is formulated, an optimized portfolio is selected for a given input level (return) or variance that corresponds to a point on the frontier. These inputs or variances are the predetermined standards set by the person formulating the portfolio. Stated differently, one seeking the optimum portfolio determines an acceptable input level (indicative of sensitivity) or a given level of variance (indicative of specificity) and selects the genes that lie along the efficient frontier that correspond to that input level or variance. The Wagner Software can select such genes when an input level or variance is selected. It can also assign a weight to each gene in the portfolio as it would for a stock in a stock portfolio.

[0025] Determining whether a sample has the condition for which the portfolio is diagnostic can be conducted by comparing the expression of the genes in the portfolio for the patient sample with calculated values of differentially expressed genes used to establish the portfolio. Preferably, a portfolio value is first generated by summing the products of the intensity value of each gene in the portfolio and the weight assigned to that gene in the portfolio selection process. A boundary value is then calculated by (Y*standard deviation + mean of the portfolio value for baseline groups), where Y is a stringency value having the same meaning as X described above. A sample having a portfolio value greater than this boundary value is then classified as having the condition. If desired, this process can be conducted iteratively in accordance with well known statistical methods for improving confidence levels.
[0026] Optionally one can reiterate this process until best prediction accuracy is obtained.
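
The portfolio value and boundary value calculation can be illustrated with the short sketch below. The gene names, weights, and intensities are hypothetical, the weights are assumed to have already been produced by a portfolio selection step, and the sample standard deviation is again an assumption.

```python
from statistics import mean, stdev

def portfolio_value(intensities, weights):
    """Weighted sum of gene intensities for one sample."""
    return sum(intensities[gene] * weight for gene, weight in weights.items())

def boundary_value(baseline_values, y=3.0):
    """Y * standard deviation + mean of the baseline portfolio values."""
    return y * stdev(baseline_values) + mean(baseline_values)

def has_condition(sample, weights, baseline_samples, y=3.0):
    """True when the sample's portfolio value exceeds the boundary value."""
    baseline_values = [portfolio_value(s, weights) for s in baseline_samples]
    return portfolio_value(sample, weights) > boundary_value(baseline_values, y)

# Hypothetical two-gene portfolio with equal weights.
weights = {"G1": 0.5, "G2": 0.5}
baseline_samples = [
    {"G1": 90.0, "G2": 90.0},
    {"G1": 100.0, "G2": 100.0},
    {"G1": 110.0, "G2": 110.0},
]
positive = has_condition({"G1": 200.0, "G2": 100.0}, weights, baseline_samples, y=2.0)
negative = has_condition({"G1": 110.0, "G2": 120.0}, weights, baseline_samples, y=2.0)
```

Here the baseline portfolio values are 90, 100, and 110, so with Y = 2 the boundary value is 120; a sample scoring 150 is classified as having the condition and a sample scoring 115 is not.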
[0027] The process of portfolio selection and characterization of an unknown is summarized as follows: 

1. Choose baseline class.

2. Calculate mean and standard deviation of each gene for baseline class samples.

3. Calculate (X*Standard Deviation + Mean) for each gene. This is the baseline reading against which all other samples will be compared. X is a stringency variable, with higher values of X being more stringent than lower.

4. Calculate ratio between each Experimental sample versus baseline reading calculated in step 3.

5. Transform ratios such that ratios less than 1 are negative (e.g., using Log base 10). (Down regulated genes now correctly have the negative values necessary for MV optimization.)

6. These transformed ratios are used as inputs in place of the asset returns that are normally used in the software application.

7. The software will plot the efficient frontier and return an optimized portfolio at any point along the efficient frontier.

8. Choose a desired return or variance on the efficient frontier.

9. Calculate the Portfolio's Value for each sample by summing the products of each gene's intensity value and the weight generated by the portfolio selection algorithm.

10. Calculate a boundary value by adding the mean Portfolio Value for Baseline groups to the product of Y and the Standard Deviation of the Baseline's Portfolio Values. Values greater than this boundary value shall be classified as the Experimental Class.

11. Optionally one can reiterate this process until best prediction accuracy is obtained.

[0028] Alternatively, genes can first be pre-selected by identifying those genes whose expression shows some minimal level of differentiation. The pre-selection in this alternative method is preferably based on a threshold given by

$$1 \le \frac{\left|\mu_t - \mu_n\right|}{\sigma_t + \sigma_n}$$

where $\mu_t$ is the mean of the subset known to possess the disease or condition, $\mu_n$ is the mean of the subset of normal samples, and $\sigma_t + \sigma_n$ represents the combined standard deviations. A signal to noise cutoff can also be used by pre-selecting the data according to a relationship such as

$$0.5 \le \frac{[\text{numerator not legible in the source}]}{\sigma_t + \sigma_n}$$

This ensures that genes that are pre-selected based on their differential modulation are differentiated in a clinically significant way, that is, above the noise level of instrumentation appropriate to the task of measuring the diagnostic parameters. For each marker pre-selected according to these criteria, a matrix is established in which columns represent samples, rows represent markers, and each element is a normalized intensity measurement for the expression of that marker according to the relationship:

[relationship not legible in the source]

where I is the intensity measurement.
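
A pre-selection filter of this kind might be sketched as follows. Because the threshold equation is damaged in the source, the score used here is an assumption inferred from the stated definitions: the absolute difference between the disease and normal group means, divided by the sum of the two group standard deviations. Gene names and intensities are hypothetical, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev

def separation_score(disease, normal):
    """Assumed score: |mean difference| over the summed standard deviations."""
    return abs(mean(disease) - mean(normal)) / (stdev(disease) + stdev(normal))

def preselect(genes, threshold=1.0):
    """Keep genes whose separation score meets the threshold.

    genes maps a gene name to a (disease_intensities, normal_intensities) pair.
    """
    return [
        name
        for name, (disease, normal) in genes.items()
        if separation_score(disease, normal) >= threshold
    ]

# Hypothetical data: GENE_A separates the groups clearly, GENE_B does not.
genes = {
    "GENE_A": ([190.0, 200.0, 210.0], [90.0, 100.0, 110.0]),
    "GENE_B": ([90.0, 100.0, 115.0], [95.0, 100.0, 105.0]),
}
selected = preselect(genes, threshold=1.0)
```

Raising the threshold shrinks the candidate set. The sketch does not guard against the case where both groups have zero variance, which would need separate handling.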

[0029] It is also possible to set additional boundary conditions to define the optimal portfolios. For example, portfolio size can be limited to a fixed range or number of markers. This can be done either by making data pre-selection criteria more stringent (e.g., using a cutoff of 0.8 instead of 0.5 in the signal to noise relationship) or by using programming features such as restricting portfolio size. One could, for example, set the boundary condition that the efficient frontier is to be selected from among only the most optimal 10 genes. One could also use all of the genes pre-selected for determining the efficient frontier and then limit the number of genes selected (e.g., no more than 10).

[0030] The process of selecting a portfolio can also include the application of heuristic rules. Preferably, such rules are formulated based on biology and an understanding of the technology used to produce clinical results. More preferably, they are applied to output from the optimization method. For example, the mean variance method of portfolio selection can be applied to microarray data for a number of genes differentially expressed in subjects with breast cancer. Output from the method would be an optimized set of genes that could include some genes that are expressed in peripheral blood as well as in diseased breast tissue. If samples used in the testing method are obtained from peripheral blood and certain genes differentially expressed in instances of breast cancer could also be differentially expressed in peripheral blood, then a heuristic rule can be applied in which a portfolio is selected from the efficient frontier excluding those that are differentially expressed in peripheral blood. Of course, the rule can be applied prior to the formation of the efficient frontier by, for example, applying the rule during data pre-selection.

[0031] Other heuristic rules can be applied that are not necessarily related to the biology in question. For example,
one can apply the rule that only a given percentage of the portfolio can be represented by a particular gene or genes. 
Commercially available software such as the Wagner Software readily accommodates these types of heuristics. This 
can be useful, for example, when factors other than accuracy and precision (e.g., anticipated licensing fees) have an 
impact on the desirability of including one or more genes. 

[0032] One method of the invention involves comparing gene expression profiles for various genes (or portfolios) to conduct diagnoses as described above. The gene expression profiles of each of the genes comprising the portfolio are fixed in a medium such as a computer readable medium. This can take a number of forms. For example, a table can be established into which the range of signals (e.g., intensity measurements) indicative of disease is input. Actual patient data can then be compared to the values in the table to determine whether the patient samples are normal or diseased. In a more sophisticated embodiment, patterns of the expression signals (e.g., fluorescent intensity) are recorded digitally or graphically. The gene expression patterns from the gene portfolios used in conjunction with patient samples are then compared to the expression patterns. Pattern comparison software can then be used to determine whether the patient samples have a pattern indicative of the disease in question. Of course, these comparisons can also be used to determine whether the patient results are normal. The expression profiles of the samples are then compared to the portfolio of a normal or control cell. If the sample expression patterns are consistent with the expression pattern for a colorectal cancer then (in the absence of countervailing medical considerations) the patient is diagnosed as positive for colorectal cancer. If the sample expression patterns are consistent with the expression pattern from the normal/control cell then the patient is diagnosed negative for colorectal cancer.
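
One simple way to compare a sample pattern against stored reference patterns is by Pearson correlation, sketched below. This is an illustrative assumption, not the comparison software referred to in the specification; the reference patterns, labels, and sample values are hypothetical log-ratio profiles over the same ordered set of genes.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def closest_pattern(sample, reference_patterns):
    """Return the label of the reference pattern most correlated with the sample."""
    return max(
        reference_patterns,
        key=lambda label: pearson(sample, reference_patterns[label]),
    )

# Hypothetical reference profiles for a four-gene portfolio.
reference_patterns = {
    "tumor": [2.0, 1.5, -1.0, -2.0],
    "normal": [0.0, 0.1, 0.0, -0.1],
}
call_a = closest_pattern([1.8, 1.2, -0.7, -1.9], reference_patterns)
call_b = closest_pattern([-0.1, 0.3, 0.0, -0.2], reference_patterns)
```

Correlation compares the shape of a profile and ignores its overall magnitude, so a real classifier would usually combine it with intensity thresholds or one of the other pattern recognition methods.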

[0033] Numerous well known methods of pattern recognition are available. The following references provide some examples:

Weighted Voting: Golub, TR., Slonim, DK., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, JP., Coller, H., Loh, ML., Downing, JR., Caligiuri, MA., Bloomfield, CD., Lander, ES. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531-537, 1999

Support Vector Machines: Su, AI., Welsh, JB., Sapinoso, LM., Kern, SG., Dimitrov, P., Lapp, H., Schultz, PG., Powell, SM., Moskaluk, CA., Frierson, HF. Jr., Hampton, GM. Molecular classification of human carcinomas by use of gene expression signatures. Cancer Research 61:7388-93, 2001 and
Ramaswamy, S., Tamayo, P., Rifkin, R., Mukherjee, S., Yeang, CH., Angelo, M., Ladd, C., Reich, M., Latulippe, E., Mesirov, JP., Poggio, T., Gerald, W., Loda, M., Lander, ES., Golub, TR. Multiclass cancer diagnosis using tumor gene expression signatures. Proceedings of the National Academy of Sciences of the USA 98:15149-15154, 2001

K-nearest Neighbors: Ramaswamy, S., Tamayo, P., Rifkin, R., Mukherjee, S., Yeang, CH., Angelo, M., Ladd, C., Reich, M., Latulippe, E., Mesirov, JP., Poggio, T., Gerald, W., Loda, M., Lander, ES., Golub, TR. Multiclass cancer diagnosis using tumor gene expression signatures. Proceedings of the National Academy of Sciences of the USA 98:15149-15154, 2001

Correlation Coefficients: van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, Friend SH. Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002 Jan 31;415(6871):530-6.

[0034] The gene expression profiles of this invention can also be used in conjunction with other non-genetic diagnostic methods useful in cancer diagnosis, prognosis, or treatment monitoring. For example, in some circumstances it is beneficial to combine the diagnostic power of the gene expression based methods described above with data from conventional markers such as serum protein markers (e.g., carcinoembryonic antigen). A range of such markers exists, including such analytes as CA 19-9, CA 125, CK-BB, and Guanylyl Cyclase C. In one such method, blood is periodically taken from a treated patient and then subjected to an enzyme immunoassay for one of the serum markers described above. When the concentration of the marker suggests the return of tumors or failure of therapy, a sample source amenable to gene expression analysis is taken. Where a suspicious mass exists, a fine needle aspirate is taken and gene expression profiles of cells taken from the mass are then analyzed as described above. Alternatively, tissue samples may be taken from areas adjacent to the tissue from which a tumor was previously removed. This approach can be particularly useful when other testing produces ambiguous results.

[0035] Combining the use of genetic markers with other diagnostics is most preferred when the reliability of the other diagnostic is suspect. For example, it is known that serum levels of CEA can be substantially affected by factors having nothing to do with a patient's cancer status. It can be beneficial to conduct a combination gene expression/CEA assay when a patient being monitored following treatment for colon cancer shows heightened levels in routine CEA assays.
[0036] Articles of this invention include representations of the gene expression profiles useful for treating, diagnosing, prognosticating, and otherwise assessing diseases. These profile representations are reduced to a medium that can be automatically read by a machine, such as computer readable media (magnetic, optical, and the like). The articles can also include instructions for assessing the gene expression profiles in such media. For example, the articles may comprise a CD ROM having computer instructions for comparing gene expression profiles of the portfolios of genes described above. The articles may also have gene expression profiles digitally recorded therein so that they may be compared with gene expression data from patient samples. Alternatively, the profiles can be recorded in a different representational format. A graphical recordation is one such format. Clustering algorithms such as those incorporated in the "GENESPRING" and "DISCOVERY" computer programs mentioned above can best assist in the visualization of such data.

[0037] Different types of articles of manufacture according to the invention are media or formatted assays used to reveal gene expression profiles. These can comprise, for example, microarrays in which sequence complements or probes are affixed to a matrix to which the sequences indicative of the genes of interest combine, creating a readable determinant of their presence. Alternatively, articles according to the invention can be fashioned into reagent kits for conducting hybridization, amplification, and signal generation indicative of the level of expression of the genes of interest for detecting colorectal cancer.

[0038] Kits made according to the invention include formatted assays for determining the gene expression profiles. 
These can include all or some of the materials needed to conduct the assays such as reagents and instructions. 

[0039] The invention is further illustrated by the following non-limiting examples. 

Examples: Genes analyzed according to this invention are identified by reference to Gene ID Numbers in the GenBank 
database. These are typically related to full-length nucleic acid sequences that code for the production of a protein or 
peptide. One skilled in the art will recognize that identification of full-length sequences is not necessary from an 
analytical point of view. That is, portions of the sequences or ESTs can be selected according to well-known principles, for 
which probes can be designed to assess gene expression for the corresponding gene. 

Example 1- Sample Handling and LCM. 

[0040] Twenty-seven fresh frozen tissue samples were collected from patients who had surgery for a colorectal tumor. 
Nineteen of the samples were colorectal malignancy specimens, and eight of the samples were of normal colon mucosa. 
The tissues were snap frozen in liquid nitrogen within 20-30 minutes of harvesting, and stored at -80°C thereafter. For 
laser capture, the samples were cut (6 µm), and one section was mounted on a glass slide, and the second on film 
(P.A.L.M.), which had been fixed onto a glass slide (Micro Slides Colorfrost, VWR Scientific, Media, PA). The section 
mounted on a glass slide was post-fixed in cold acetone, and stained with Mayer's Haematoxylin (Sigma, St. Louis, 
MO). A pathologist analyzed the samples for diagnosis and grade. The clinical stage was estimated from the 
accompanying surgical pathology and clinical reports, using the Dukes classification. The section mounted on film was 
post-fixed for five minutes in 100% ethanol, counter stained for 1 minute in eosin/100% ethanol (100 µg of eosin in 100 ml 
of dehydrated ethanol), quickly soaked once in 100% ethanol to remove the free stain, and air dried for 10 minutes. 
[0041] Two of the colorectal adenocarcinomas were of grade 1, 10 of grade 2, and 5 of grade 3. One of the malignant 
samples was a carcinoid tumor of the caecum, and one a metastatic melanoma lesion. Two of the adenocarcinoma 
samples represented the mucinous subtype, and one the signet cell subtype. The Dukes staging of the 
adenocarcinomas divided them as follows: Dukes A: 2, Dukes B: 5, Dukes C: 7, Dukes D: 3. Six of the adenocarcinomas had 
been irradiated preoperatively. 

[0042] Before use in LCM, the membrane (LPC-MEMBRANE PEN FOIL 1.35 µm No. 8100, P.A.L.M. GmbH Mikrolaser 
Technologie, Bernried, Germany) and slides were pretreated to abolish RNases, and to enhance the attachment of 
the tissue sample onto the film. Briefly, the slides were washed in DEP H2O, and the film was washed in RNase AWAY 
(Molecular Bioproducts, Inc., San Diego, CA) and rinsed in DEP H2O. After attaching the film onto the glass slides, 
the slides were baked at +120°C for 8 hours, treated with TI-SAD (Diagnostic Products Corporation, Los Angeles, CA; 
1:50 in DEP H2O, filtered through cotton wool), and incubated at +37°C for 30 minutes. Immediately before use, a 10 µl 
aliquot of RNase inhibitor solution (RNasin Inhibitor 2500 U = 33 U/µl, N211A, Promega GmbH, Mannheim, Germany; 
0.5 µl in 400 µl of freezing solution containing 0.15 mol NaCl, 10 mmol Tris pH 8.0, 0.25 mmol dithiothreitol) was spread 
onto the film, where the tissue sample was to be mounted. 

[0043] The tissue sections mounted on film were used for LCM. Approximately 2000 epithelial cells/sample were 
captured using the PALM Robot-Microbeam technology (P.A.L.M. Mikrolaser Technologie, Carl Zeiss, Inc., Thornwood, 
NY), coupled to a Zeiss Axiovert 135 microscope (Carl Zeiss Jena GmbH, Jena, Germany). The surrounding stroma 
in the normal mucosa, and the occasional intervening stromal components in cancer samples, were included. The 
captured cells were put in tubes in 100% ethanol and preserved at -80°C. 


Example 2- RNA Extraction and Amplification. 

[0044] A Zymo-Spin Column (Zymo Research, Orange, CA 92867) was used to extract total RNA from the LCM 
captured samples. About 2 ng of total RNA was resuspended in 10 µl of water and 2 rounds of T7 RNA polymerase 
based amplification were performed to yield about 50 µg of amplified RNA. 

Example 3- cDNA Microarray Hybridization and Quantitation. 

[0045] A set of cDNA microarrays consisting of approximately 20,000 human cDNA clones was used to test the 
samples. About 30 plant genes were also printed on the microarrays as a control for non-specific hybridization. 
Cy3-labeled cDNA probes were synthesized from 5 µg of aRNA of the LCM captured cells. The probes were purified with 
Qiagen's Nucleotide Removal Columns and then hybridized to the microarrays for 14-16 hours. The slides were washed 
and air-dried before scanning. cDNA microarrays were scanned for Cy3 fluorescence and ImaGene software 
(Biodiscovery, Los Angeles, CA) was used for quantitation. For each cDNA clone, four measurements were obtained using 
duplicate spots and duplicate arrays and the intensities were averaged. 

[0046] cDNAs were printed on amino silane-coated slides (Corning) with a Generation III Micro-array Spotter 
(Molecular Dynamics). The cDNAs were PCR amplified, purified (Qiagen PCR purification kit), and mixed 1:1 with 10 M 
NaSCN printing buffer. Prior to hybridization, micro-arrays were incubated in isopropanol at room temperature for 10 
min. The probes were incubated at 95°C for 2 min, at room temperature for 5 min, and then applied to three replicate 
slides. Cover slips were sealed onto the slides with DPX (Fluka) and incubated at 42°C overnight. Slides were then 
washed at 55°C for 5 min in 1X SSC/0.2% SDS and 0.1X SSC/0.2% SDS, dipped in 0.1X SSC and dried before being 
scanned by a Gen III Array Scanner (Molecular Dynamics). The fluorescence intensity for each spot was analyzed with 
AUTOGENE software (Biodiscovery, Los Angeles). 

[0047] Chip intensities were linearly normalized, forcing the intensity reading at the 75th percentile to equal a 
value of 100 on each chip. Every gene on the chip was normalized to itself by dividing the intensity reading for that 
gene by the median of the gene's expression value readings over all the samples. Prior to clustering, genes that did 
not have an intensity reading of 100 or greater in at least one sample were filtered out in order to limit the background 
effect on the similarity metrics. A set of 6,225 genes was selected for clustering analysis. Hierarchical clustering was 
performed using correlation as a measure of similarity, which groups together samples with genes that are showing 
positive changes at the same time without any consideration for negative changes (Silicon Genetics, Sunnyville, CA). 
Each of the major nodes in the dendrogram was then considered a subgroup of samples. Differentially expressed 
genes were identified by comparing each tumor subgroup to the normal group. The selection was based on a signal 
to noise measurement threshold given by 

$$1 < \frac{|\mu_t - \mu_n|}{\sigma_t + \sigma_n}$$

where $\mu_t$ is the mean of the tumor subset, $\mu_n$ is the mean of the subset of normal samples, and $\sigma_t + \sigma_n$ represent the 
combined standard deviations. The within-group coefficient of variation of the intensity readings of a gene had to be 
less than 0.33 for the gene to be included in the pair-wise comparisons. The median of the tumor group over the 
median of the normal group had to be greater than, or equal to 2 for up-regulation, and less than, or equal to 0.5 for 
down-regulation. If a gene met all the criteria, it was selected. The genes selected in all the comparisons were 
considered consistently dysregulated in colorectal cancer. The p-values for the statistical significance were calculated 
using a T-test assuming unequal variance. The gene set for clustering was also subjected to principal component 
analysis (PCA) using a software package (Partek, St Louis, MO). The data was then projected onto the reduced 
3-dimensional space. The normal and tumor colorectal samples were represented by the projected expression levels. 
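
The chip scaling, gene filtering, and per-gene normalization described in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the software used in the examples: the function names, the linear-interpolation rule for the 75th percentile, and the intensity values are all introduced here for demonstration.

```python
import statistics


def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100); the interpolation rule is assumed."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def scale_chip(intensities, target=100.0, q=75):
    """Scale one chip linearly so that its q-th percentile equals `target`."""
    factor = target / percentile(intensities, q)
    return [v * factor for v in intensities]


def filter_genes(matrix, minimum=100.0):
    """Keep genes whose scaled intensity reaches `minimum` in at least one sample."""
    return {g: row for g, row in matrix.items() if max(row) >= minimum}


def normalize_per_gene(matrix):
    """Divide each gene's readings by that gene's median over all samples."""
    return {g: [v / statistics.median(row) for v in row] for g, row in matrix.items()}


# Hypothetical raw intensities: one list per chip, genes in a fixed order.
GENES = ["g1", "g2", "g3", "g4", "g5"]
chips = {
    "s1": [20.0, 40.0, 60.0, 80.0, 100.0],
    "s2": [10.0, 30.0, 50.0, 70.0, 90.0],
    "s3": [5.0, 100.0, 150.0, 200.0, 250.0],
}
scaled = {s: scale_chip(v) for s, v in chips.items()}
# Transpose to one row of sample readings per gene.
matrix = {g: [scaled[s][i] for s in chips] for i, g in enumerate(GENES)}
kept = filter_genes(matrix)
normalized = normalize_per_gene(kept)
print(sorted(kept))
```

In this sketch the filter is applied to the chip-scaled intensities before the per-gene division, because the threshold of 100 is stated on the intensity scale.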
[0048] A list of genes with large up-regulated differentials was created to distinguish between the tumor and normal 
samples. One-hundred and twenty-three genes were preselected by using 

$$0.5 \le \frac{\mu_t - \mu_n}{\sigma_t + \sigma_n}$$

as a signal to noise cutoff. A ratio equal to, or greater than 1.5 was the minimal criterion for up-regulation. Genes were 
also included if 

$$0.9 < \frac{\mu_t - \mu_n}{\sigma_t + \sigma_n}$$

A portfolio of four genes was established, each having at least a three fold expression differential between tumor and 
normal cells. 
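
The pairwise selection criteria of paragraphs [0047] and [0048] can be sketched as below. The sketch assumes that the signal to noise measure is the difference between the group means divided by the sum of the group standard deviations, taken in absolute value so that down-regulated genes can qualify. The function names, the use of sample standard deviations, and the intensity values are illustrative assumptions. The unequal-variance t statistic is shown without conversion to a p-value, which requires a t-distribution routine from a statistics package.

```python
import math
import statistics


def signal_to_noise(tumor, normal):
    """(mean_t - mean_n) / (sd_t + sd_n), with sample standard deviations (assumed)."""
    return (statistics.mean(tumor) - statistics.mean(normal)) / (
        statistics.stdev(tumor) + statistics.stdev(normal)
    )


def coefficient_of_variation(values):
    """Within-group standard deviation divided by the group mean."""
    return statistics.stdev(values) / statistics.mean(values)


def classify_gene(tumor, normal, snr_cutoff=1.0, cv_max=0.33):
    """Return 'up', 'down', or None if any criterion fails."""
    if abs(signal_to_noise(tumor, normal)) <= snr_cutoff:
        return None
    if max(coefficient_of_variation(tumor), coefficient_of_variation(normal)) >= cv_max:
        return None
    ratio = statistics.median(tumor) / statistics.median(normal)
    if ratio >= 2.0:
        return "up"
    if ratio <= 0.5:
        return "down"
    return None


def welch_t(a, b):
    """Unequal-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical intensity readings for one gene; not data from Table 1.
normal = [90.0, 100.0, 95.0, 105.0, 98.0, 92.0, 101.0, 99.0]
tumor = [40.0, 38.0, 45.0, 35.0, 42.0, 39.0, 41.0, 37.0, 44.0, 36.0]
call = classify_gene(tumor, normal)
t_stat, dof = welch_t(tumor, normal)
print(call, round(t_stat, 2), round(dof, 1))
```

Passing a different `snr_cutoff` (for example 0.5) gives the looser preselection described for the up-regulated gene list; the 1.5-fold ratio used there would need its own threshold parameter.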

[0049] Differentially Expressed Genes in Colorectal Cancer. Thirty-nine genes were differentially expressed in all 
tumor samples as compared to normal colon mucosa. Thirty-seven of them were significantly down-regulated in all 
the tumors, except for an outlier. Two of them were up-regulated. The identities of the genes were verified by sequencing 
the cDNA clones placed on the microarray. Results are shown in Table 1. 



Table 1 

Modulated Genes 

                                                              MEAN SIGNAL INTENSITY
ACCESSION  GENE DESCRIPTION                                   NORMAL    TUMOR     P-VALUE    SEQ. ID NO.
AF071569   CaM kinase II gene subtype delta 2                 93        39        4.64E-09   1
AB014530   Homo sapiens mRNA for KIAA0630 protein             108       50        4.83E-07   2
AK000319   Human cDNA KIAA0630                                236       69        7.84E-06   3
U81504     beta-3A-adaptin subunit of the AP-3 complex mRNA   241       75        3.52E-05   4
AB011166   Human cDNA KIAA0594                                116       55        3.53E-05   5
AB040914   Human cDNA KIAA1481                                187       59        8.85E-05   6
AK025205   Human cDNA FLJ21552                                322       97        0.00013    7
AJ278219   Fatty acid hydroxylase                             143       53        0.00011    8
AB046854   Human cDNA KIAA1634                                142       59        0.00020    9
R00585     Unknown                                            149       57        1.28E-09   10
845844     Spi-B transcription factor                         140       43        0.00043    11
X98311     Carcinoembryonic antigen family member 2 (CGM2)    6137      223       0.00044    12
BAA78050   NADPH oxidoreductase homolog                       153       84        0.00048    40
N72128     Unknown                                            164       77        0.00068    13
AB040955   Human cDNA KIAA1552                                334       120       0.00067    14
AF125101   HSPC040 protein                                    363       115       0.0011     15
AB023229   Human cDNA KIAA1012                                263       88        0.00099    16
N95761     α-L-fucosidase gene                                429       104       0.00047    17
AK025033   Human cDNA FLJ21380                                180       85        0.0010     18
L10844     Human cellular growth regulating protein           206       101       0.0013     19
H96534     H. sapiens mRNA for gp25L2 protein                 147       58        0.0015     20
AK001521   Human cDNA FLJ10659                                157       60        0.0019     21
AF151039   HSPC205 protein                                    117       60        0.0017     22
AF052059   SEL1L protein                                      168       53        0.0016     23
N24597     Unknown                                            166       62        0.0016     24
AK001950   Inner centromere protein                           148       64        0.0029     25
BAA02649   Macrophage scavenger receptor type 1               118       44        0.0031     41
N75004     Unknown                                            98        48        0.0031     26
W16916     Human cDNA KIAA0260                                162       61        0.0037     27
X52001     H. sapiens endothelin 3 mRNA                       89        33        0.0042     28
T50788     Unknown                                            364       102       0.0059     38
AJ005866   Putative Sqv-7 like protein                        381       163       0.0049     29
AF113535   MAID protein                                       218       100       0.0053     39
AB037789   Human cDNA KIAA1368                                164       62        0.0068     30
M33987     Carbonic anhydrase                                 652       46        0.0074     31
M77830     Desmoplakin 1 (DPI)                                184       81        0.0092     32
H81220     ETS domain transcription factor ELF1               113       55        0.017      33
AF000592   Human chromosome 21q11-q21 genomic clone           33        69        1.16E-05   35
AK021701   Human cDNA FLJ11639                                31        63        0.00070    36



Example 4: Optimized Portfolio for Colorectal Tumors. 



[0050] The mean-variance optimization algorithm was used to generate a multiple gene-based signature, where the 
genes that are included can be used in combination to distinguish between the normal and tumor samples. Intensity 
measurements were processed using the samples and microarrays described in Examples 1-3. The data to be analyzed 
was first preselected based on a pre-specified 5-fold differential between tumor and normal cells. The expression data 
from genes preselected according to this criterion were then used as follows. The mean and standard deviation of the 
intensity measurements for each gene were calculated using the non-metastatic samples as the baseline. A 
discriminating value of X*(Standard Deviation + Mean) was then calculated for each baseline gene (X was assigned a value 
of 3). This value was used to ensure the resulting portfolio would be stringent. A ratio of the discriminating value to the 
baseline value was then calculated for each metastatic sample. This ratio was then converted to a common logarithm. 
This data was then imported into Wagner Software, which produced an efficient frontier from which a portfolio of 4 
genes was selected. The set included an unknown sequence, procollagen type I, large subunit of ribosomal protein 
L21 and fibronectin. These genes are identified as Seq. ID No. 42, Seq. ID No. 43, Seq. ID No. 44 and Seq. ID No. 45. 
Alternatively, a combination of genes used to make up the portfolio can be used to produce diagnostic information that 
is useful for making clinical decisions regarding colorectal cancer. This is particularly beneficial in the case when a 
combination of genes selected from the portfolio is combined with additional markers (genetic or not). 
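
The pre-processing that precedes the portfolio optimization can be sketched as follows. The wording of the ratio step admits more than one reading; this sketch assumes that each sample's intensity for a gene is divided by that gene's discriminating value before the common logarithm is taken. The function names, the use of the sample standard deviation, and the intensity values are hypothetical, and the efficient-frontier step itself is not reproduced.

```python
import math
import statistics


def discriminating_value(baseline, x=3.0):
    """X * (standard deviation + mean) of the baseline readings for one gene."""
    return x * (statistics.stdev(baseline) + statistics.mean(baseline))


def log_ratios(samples, baseline, x=3.0):
    """Common logarithm of each sample intensity over the discriminating value (assumed reading)."""
    d = discriminating_value(baseline, x)
    return [math.log10(v / d) for v in samples]


# Hypothetical readings for a single gene.
baseline = [10.0, 12.0, 8.0, 10.0]
samples = [349.0, 30.0]
d = discriminating_value(baseline)
ratios = log_ratios(samples, baseline)
print(round(d, 3), [round(v, 3) for v in ratios])
```

Under this reading, a positive value marks a sample whose intensity exceeds the stringent threshold for that gene, and a negative value marks one that does not.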
Optimized Gene Portfolio: 

[0051] 

>gi|1264443|gb|N92134.1|N92134 za23f09.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:293417 
5' similar to gb|M87908|HUMALNE32 Human carcinoma cell-derived Alu RNA transcript, (rRNA); 
gb:X57025_rna1 INSULIN-LIKE GROWTH FACTOR IA PRECURSOR (HUMAN) 

>gi|2221047|gb|AA490172.1|AA490172 ab06b08.s1 Stratagene fetal retina 937202 Homo sapiens cDNA clone 
IMAGE:839991 3' similar to gb:J03464 PROCOLLAGEN ALPHA 2(I) CHAIN PRECURSOR (HUMAN) 

>gi|2188918|gb|AA464034.1|AA464034 zx86b09.r1 Soares ovary tumor NbHOT Homo sapiens cDNA clone 
IMAGE:810617 5' similar to SW:RL21_HUMAN P46778 60S RIBOSOMAL PROTEIN L21. 

>gi|834491|gb|R62612.1|R62612 yi12d01.s1 Soares placenta Nb2HP Homo sapiens cDNA clone IMAGE:139009 
3' similar to gb:X02761_cds1 FIBRONECTIN PRECURSOR (HUMAN); 

[0052] Using a different set of criteria but the same method, a further four-gene portfolio was selected by the software. 
These are Seq. ID No. 46, Seq. ID No. 47, Seq. ID No. 48 and Seq. ID No. 49. Two genes overlap with the first 
four-gene portfolio. The two optimized portfolios can also be combined to form a six-gene portfolio. 

Optimized Gene Portfolio: 

[0053] 

>gi|2114953|gb|AA431245.1|AA431245 zw78d06.r1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:782315 
5' similar to WP:F36H1.2 CE05814 ANKYRIN LIKE 

>gi|2156172|gb|AA443497.1|AA443497 zw34d03.r1 Soares ovary tumor NbHOT Homo sapiens cDNA clone 
IMAGE:771173 

>gi|2221047|gb|AA490172.1|AA490172 ab06b08.s1 Stratagene fetal retina 937202 Homo sapiens cDNA clone 
IMAGE:839991 3' similar to gb:J03464 PROCOLLAGEN ALPHA 2(I) CHAIN PRECURSOR (HUMAN) 

>gi|1264443|gb|N92134.1|N92134 za23f09.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:293417 
5' similar to gb|M87908|HUMALNE32 Human carcinoma cell-derived Alu RNA transcript, (rRNA); 
gb:X57025_rna1 INSULIN-LIKE GROWTH FACTOR IA PRECURSOR (HUMAN); 
SEQUENCE LISTING 

<110> WANG, YIXIN 

<120> COLORECTAL CANCER DIAGNOSTICS 

<130> CDS 267 US NP 

<140> TBD 

<141> 2003-03-21 

<150> 60/368,798 
<151> 2002-03-29 

<160> 49 

<170> PatentIn version 3.1 

<210> 1 

<211> 1500 

<212> DNA 

<213> human 

<400> 1 

atggcttcga ccaccacctg caccaggttc acggacgagt atcagctttt cgaggagctt 
60 

ggaaaggggg cattctcagt ggtgagaaga tgtatgaaaa ttcctacUgg acaaggatat 
120 

gctgccaaaa ttatcaacac caaaaagctt tctgctaggg atcatcagaa actagaaaga 

180 

gaagctagaa tctgccgtct tttgaagcac cctaatattg tgcgacttca tgatagcata 
240 

tcagaagagg gctttcacta cttggtgttt gatttagtta ctggaggtga actgtttgaa 

300 

gacatagtgg caagagaata ctacagtgaa gctgatgcca gtcattgUat acagcagatt 
360 

ctagaaagtg ttaatcattg tcacctaaat ggcatagttc acagggacct gaagcctgag 
420 

aatttgcttt tagctagcaa atccaaggga gcagctgtga aattggcaga ctttggctta 
480 

gccatagaag ttcaagggga ccagcaggcg tggtttggtt ttgctggcac acctggatat 
540 

ctttctccag aagttttacg taaagatcct tatggaaagc cagtggatat gtgggcatgt 
600 

ggtgtcattc tctatattct acttgtgggg tatccaccct tctgggatga agaccaacac 
660 

agactctatc agcagatcaa ggctggagct tatgattttc catcaccaga atgggacacg 
720 

gtgactccLg aagccaaaga cctcatcaat aaaatgctta ctatcaaccc tgccaaacgc 
780 

atcacagcct cagaggcact gaagcaccca tggatctgtc aacgttctac tgttgcttcc 
840 
atgatgcaca gacaggagac tgtagactgc 
900 



aagggtgcca tcttgacaac tatgctggct 
960 



ttgaagaaac cagatggagt aaaggagtca 
1020 



gaagatgtga aagcacgaaa gcaagagatt 
1080 



atcaacaatg gggactttga agcctacaca 
1140 



gaacctgaag ctttgggtaa Cttagtgqaa 
1200 



aatgctttgt ccaaaagcaa taaaccaatc 
1260 



ctggtagggg atgatgccgc ctgcatagca 
1320 



agtggaatgc caaagacaat gcagtcagaa 
1380 



aagtggcaga atgttcattt tcatcgctcg 
1440 



tgtattccaa atgggaaaga aaacttctca 
1500 



<210> 2 

<211> 5761 

<212> DNA 

<213> human 

<400> 2 

cacaccgcag tatgcggtgc cctttactct 
60 

tgaacagacL gccgctgLac tggcgtggcc 
120 



aacttggcaa cagttgcctg gggtagctct 
180 



tccagaggcG atggggagtg gacagcagct 
240 



caaccagtac agcactatca tgcagcagcc 
300 



cactgctcag cctctgaatg ttggtgttgc 
360 



cctcccttcg aagaagaata agcagtcagc 
420 



tctgccttcc caagtctatt ctctggttgg 
480 



taattccttg gtccctgtcc aagatcagca 
540 



ttgaagaaat ttaatgctag aagaaaacta 

acaaggaact tctcagcagc caagagcctg 

actgagagtt caaatacaac aattgaggat 

atcaaagtca ctgaacaact gatcgaagct 

aaaatctgtg acccaggcct tactgccttt 

gggatggatt ttcaccgatt ctactttgaa 

cacactatta ttctaaaccc tcatgtacat 

tatattaggc tcacacagta catggatggc 

gagactcgtg tgtggcaccg ccgggatgga 

gggtcaccaa cagtacccaL caagccaccc 

ggaggcacct ctttigtggca aaacatctga 

gagctgcgca gccggccggc cggcgctggt 
tggagggact cagcaaaCtc tcctgccttc 
acacaactct gtccagccca cagcaatgat 
agctgactgg aggaatgccc actctcatgg 
atccttgctg actaaccatg tgacattggc 
ccatgttgtc agacaacaac aatccagttc 
tccagtctct tccaagtcct ctctagatgt 
gagcagtccc ctccgcacca catcttctta 
tcagcccatc atcattccag atactcccag 
ccctcctgtg agtgtcatca 
600 



caagcccaqt agctctggac 
660 



tgattctcca gactctgact 
720 



tctccgaggc aatagtggat 
780 



tggcacccgc actatcattg 
840 



aacccaggcc tcaggtctcc 
900 



gtcatctgga trgctgtatca 
960 



agcacaacca ctcaatctta 
1020 



gagaagcagc aacccagccc 
1080 



cccctacacc ttccagcatq 
1140 



ggcccctgct cacctgccaa 
1200 



tgctgcactg ggctcaacca 
1260 



gcatgctgca gcctatacca 
1320 

tgggcccagc ctcctcactt 
1380 



cacccaatcc tacattgggt 
1440 



tcctaccaag atcagccagt 
1500 



atggctacct tctcctggcc 
1560 



ccctcttgaa atttcttagc 
1620 



tttctctggg ggaacctgtc 
1680 



cctattttta aattcattat 
1740 



cccatcttct gcagttacca 
1800 

tttgtctctc tgacttgatt 
1860 



ctatccgaag tgacaclgat 
tgaagccaag gCctaatgtc 
cttctttgag cagcccttat 
ccgttttgga ggggcctggc 
tgcctccact gaaaactcag 
tgagcaataa gactaagcca 
cucccacagg gtatcgagct 
gccagaacca gcagtcatcg 
cccgcaggca gcaggcgCtt 
gcagcccgct acactcgaca 
gccaggctca tctgtatacg 
gctccattgc tcatcttttc 
ctcaccctag cactttggtg 
ctgccagcgt ggcccctgct 
cttcccgagg ctcaacaatt 
attcctactt atagttggtg 
ctgcgttctt aatattgggc 
cagcaacttg ttctgcaggg 
tcagtgttga ctgcattgtt 
ttttgt.gaca gtaattttgg 
aggaagagag attgttctga 
tctataaatg cttttaaaaa 




gaggaagagg acaacaaata 
atcagttatg tcactgtcoa 
tccactgata ccctgagtgc 
agagttgtgg cagatggcac 
cttggtgact gcactgtagc 
gtcgcttcag tgagtgggca 
caacgcgggg ggaccagUgc 
gcggctccaa cctcacagga 
gtggcccctc tctcccaagc 
qggcacccac accttgcccc 
tatgctgccc cgacttctgc 
tccccacagg gttcctcaag 
caccaggtcc ctgtcagtgt 
cagtaccaac accagtttgc 
tacactggat acccgctgag 
agcatgaggg aggaggaatc 
tatggagaga tcctccttta 
gcccactgaa gcaqaaggtt 
gtagtcttcc caaagttbgc 
tacttggaag agttcagatg 
agttaccctc tgaaaaatat 
caagtqaagc ccctctttat 
ttcatttttjt gttattgtga ttgctggtca 
1920 

tgatgacaaa aaaagaaaaa tcactttttg 
1980 

attttaaaag cggcttacac aatctccctt 
2040 

ttcagttttg ttttaatgtc atattatact 

2100 

gttacgtatt actctgtgtt actattgaga 
2160 

agtagtgttt aaaaggcagc tcaccatttg 
2220 

tgcgtgaaaa caccaagtat tctttttaaa 
2280 

tttttaaaag tctttctctc tctgattcag 
2340 

ggtggttatt attacatggt ggtggtggtt 
2400 

agatactggc attgatgagc tttgcctaaa 
2460 

tgttttgctc atctctccct tctgttttat 
2520 

cctgaaacca gataagaaca tttcttgtgt 
2580 

tgtatgccag cagcaaattg aatgctctct 
2640 

gaattgcaaa aaatatttta aaaatttatC 
2700 

taatggtggt gtCUtaatat tttacataat 
2760 

aacaagcaat ttttcctgct aacccaaaat 
2820 

acttgaattg tgtacttagt gtgtatgtga 
2880 

tgtctccatt gtatttaaac caaaatgaac 
2940 

tgcaattata utagagcata ttactgtagt 
3000 

agaggagacc cttggaattg ttttgcacag 

3060 

gtctcttcct tccctttctt cctccttccc 
3120 

ggttaataga gtttacagtg agcttgcctt 
3180 



ggaaaaatgc tgatagaagg agLLgaaatc 
tttgtttata aactcagact tgcctatttt 
ttgtttattg gacatttaao cttacagagt 
taatgggcaa ttgttatttc tgcaaaactg 
ttctctcaat tgctcctgtg tttgttataa 
ctggtaactt aatgtgagag aatccatatc 
tgaagcacca tgaattcttt tttaaattat 
cttaaatttt tttatcgaaa aagccattaa 
ttattatatg caaaatcUct gtctattatg 
gattagtatg aattttcagt aatacacctc 
gtgatttgtt tggggagaaa gctaaaaaaa 
atagctttta tacttcaaag tagcLtcctt 
tattaagact tatataataa gtgcatgtag 
actgaattta aaaatatttt agaagttttg 
taaatatgta catattgatt agaaaaatat 
gttatttgta atcaaatgtg tagtgattac 
tcctccagtg ttatcccgga gatggattga 
tgatacttgt tggaatgtat gtgaactaat 
gctgaatgag caggggcatt gcctgcaagg 
gtgtgtctgg tgaggagttt ttcagtgtgt 
ttattgtagt gccttatatg ataatgtagt 
aggatggacc agcaagcccc cgtggaccct 
aagttgttca ccgggattta tcagaacagg 
3240 



ctcagtttcc ctgccaacat Cgaaaaataa 

3300 



tctacccctt tccatuttgg attctcggcL 
3360 



gctctctcac tgcgcgttgc taccttgctt 
3420 



gtcaagccaa tattaaatat gcattctttt 
3480 



ttttttttcc ttttcccatg tggcagtcct 
3540 



tatttgcttg ttgaaaaaaa catgttaaca 
3600 



tgcttaccat gtccccatac tatgaggaga 
3660 



tcacagaaag gtttcttagc tggtgaagaa 
3720 



attgaggctt ttgaggtttc tttLttaaca 
3780 



tgaaattgtc cttgtactct cagctcctgc 
3840 



gatggggaca ttcctgccca taaaggatCt 

3900 



tgtgttccat ccgaattgaa aatgatatat 
3960 



tagatagaga tggtgtcaag gaggtgcagg 
4020 



cagccagctc tgtaccaggt tgaacaccga 
4080 



ttgtaaggag taagggcttc caagatgggg 
4140 



gttgtgtttt ctttattttt taaaatcatt 
4200 



gtcaagatag ccaagcagtt tgtataattt 
4260 



aacatgtgtg atctttgtgt ctcctttttg 
4320 



acaggtctag tttctaaagg acaaattttt 
4380 



gatttgttgt ttttgtaaga aatgagatgc 
4440 



cccagtccaa taagcagata ccacttaaga 
4500 



attagtagct gtattgtgta atgcattgtt 
aaacagcagc ttttctcctt taccaccacc 
gagttctcac agaaqcattt tccccatgtg 
ctgtgagaat tcaggaagca ggtgagagga 
aaagtatgtg caaccacttt tagaatgaat 
tcctgcacat agttgacact cctagtaaaa 
gatgtgttta taccaaaqag cctgttgLat 
agttttgtgg tgccgctggt gacaaggaac 
tatagagaag gaaccaaagc ctgttgagtc 
gcttgtatag tcttggggcc cttcaagctg 
atggatctgg gtcaagtaga aggtactggg 
ggggaaagaa gattaatcct aaaatacagg 
ttgagatata attttaggac tggttctgtg 
atggagatgg gagatttcat ggagcctggt 
ggagctgtca aagtatttgg agtttcttca 
caggtagtcc gtacagccta ccaggaacat 
atattgagtt gtgttttcag cactatattg 
ctgtcactag tgtcatacag ttttctggtc 
ccaagcacat tctgatuttc ttgttggaac 
tgttccttgt cttttttctg taagggacaa 
aggaaagaaa accaaatccc attcctgcac 
taggagtcta aactccacag aaaaggataa 
tar.caagagc ttgtattgtt accttagtca cttgcctagc agtgtgtggc tt.taaaaact 
4560 



agagattttt cagtcttagt ctgcaaactg gcatttccga tttcccagca taaaaatcca 
4620 



cctqtgtctg ctgaatgtgt atgtatgtgc tcactgtggc tttagattct gtccctgggg 
4680 



ttagccctgt tggccctgac aggaagggag gaagccCggt gaatttagtg agcagcCggc 
4740 



ctgggtcaca gtgacctgac ctcaaaccag cttaaggctt taagtcctct ctcagaactt 
4800 



ggcatttcca acttcttcct ttccgggtqa gagaagaagc ggagaagggt tcagtgtagc 
4860 



cactctgggc tcatagggac acttggtcac tccagagttt ttaatagctc ccaggaggtg 
4920 

atattatttt cagtgctcag ctgaaatar.c aaccccagga ataagaactc catttcaaac 
4980 



agttctggcc attctgagcc tgcttttgtg attgctcatc cattgtcctc cactagaggg 
5040 



gctaagcttg actgccctta gccaggcaag cacagtaatg tgtgttttgt tcagcattat 

5100 



tatgcaaaaa ttcactagtt gagatggttt gttttaggaU aggaaatgaa attgcctctc 
5160 



agtgacagga gtggcccgag cctgcttcct attttgatt.t tttttttttt taactgatag 
5220 



atggtgcagc atgtctacat ggttgtttgt tgctaoactt tatataatgt gtggtttcaa 
5280 



ttcagcttga aaaataatct cactacatgt agcagtacat tatatgtaca ttatatgtaa 
5340 



tgttagtatt tctgctttga atccttgata ttgcaatgga attcctactt tattaaatgt 
5400 



atttgatatg ctagttattg tgtgcgattt aaactttttt tgctttctcc cttttttt.gg 
5460 



ttgtgcgctt tcttttacaa caagcctcta gaaacagata gtttctgaga attactgagc 
5520 



tatgtttgta atgcagatgc acttagggag tatgtaaaat aatcatttta acaaaagaaa 
5580 



tagatattta aaatttaata ctaactatgg gaaaagggtc cattgtgtaa aacatagttt 
5640 



atctttggat tcaatgtttg tctttggttt tacaaagtag cttgtatttt cagtattttc 
5700 



tacataatat ggtaaaatgt agagcaattg caatgcatca ataaaatggg taaattttct 
5760 



g 

5761 
<210> 3 

<211> 2129 

<212> DNA 

<213> human 

<400> 3 

ctgtattyag acaaaggaag ggatctgtca gaaagcaaca cttgttatct tgggcttggc 
60 

agcaaggaag aggacaggta gtggagatcc tgcaatctga aaagcagact gaaaggtgac 
120 

aaagaagctg aagatgggtg gtggagagag gtataacatt ccagcccctc aatctagaaa 

180 

tgttagtaag aaccaacaac agcttaacag acagaagacc aaggaacaga attcccagaC 
240 

gaagattgtt cataagaaaa aagaaagagg acatggttat aactcatcag cagctgcctg 
300 

gcaggccatg caaaatgggg ggaagaacaa aaattttcca aataatcaaa gttggaattc 
360 

tagcttatca ggtcccaggt tactttttaa atctcaagct aatcagaact atgctggtgc 
420 

caaatttagt gagccgccat caccaagtgt tcttcccaaa ccaccaagcc actgggtccc 
480 

tgtttccttt aatccttcag ataaggaaat aatgacattt caacttaaaa ccttacttaa 
540 

agtacaggta taaaataaga caaatgttta aatttagtta tgttcacggg tagttgtcaa 
600 

ttggtctgaa acaaatttgc tagggaatct atttgtgtag aactaattaa tgtaaaaaaa 
660 

atagaccatc tcgtgttgtg tgcactgtga tataatggta gtatcagtgc aacttaaact 
720 

aatgattgta cttgatatta agtgttctca actgagtaac ttttaagtgg aaaccaagtt 
780 

tagatttggg gagtggtaaa ggaatcagct ttttctattg ttaggggaag acagtaattt 
840 

atcattcatg gaccagtaga ttgttgaaag ttggtgaatc ggattataag cttctagcta 
900 

acacaaggat tcagaattag gtaaacatct gaaggtttag tatattagaa acacccaaac 
960 

cagtaatatg ctaacctgat gcactgctga aagaaaatgt gaatttttcg taataattgc 
1020 

attttagtga attgtacagt gggtggaaag ggcacttgga gctcattaga atgagacata 
1080 

gtacacccca atggccctgt ttattaaatg tagtggatta agtgtctgtc aacaaataca 
1140 

ccaaaaccat tttttataga aacagtattt aatggtcact caatagcttt caaaatacat 
1200 



ttttgtatta cagcactgca caagctattc taatagcgct ctqqcctcat cattcctgca 
1260 



aagctCgctt tggggagttg galaaLgtga aaattttaaa tacctagggg agaaagagcc 
1320 



atgtaaatat ctgtaataaa cttgtagcat atgtaaagtt ttcttggcct ttatcttaca 

1380 



aaaatggagt attttagtat gaatttgctg aatgtaagac cgtggactgt tttttataat 
1440 



atggcctaat tttaaaggtc caaaataact tgtttttaaa gtttgccctt gtgctaaagt 
1500 



gccagtgtat gtatgttata cttgatttgg ttgtaaacta tatttcaaag taaaccctag 
1560 



tgtaataagt tttataacta aaaaggttta agctgctaaa actattttta agagatgtga 

1620 



aatgcagtat gggactatct ttttttcctc ctctaagccc aaagattaac tagagtccct 
1680 



ccaaccttat agattgttgg ctttcacaat cttataacct aggatacagg tagtttcgag 
1740 



tatggtgcca gtgatgtttt gtttttgttt ggtcaagggg taggtgcaac ccaatggacc 
1800 



acttatgcaa aagatgtaaa ctcttgcata atacattgat aacatgtttt gccaacttta 

1860 



aatgcttaaa cataagcgaa accagtagca agtatgtggg tcagcttaaa aattttgatt 
1920 

gttaatgccc tattttctaa tttggcacct cttgatgcct aagcaggtaa gcagatgcct 
1980 



aagctgtatt tctccaaata aatcaagatg aagtactgcc caagttaaat attgatagcc 
2040 



taaagacaag ttuatgtagt acttaatgta catgatatga agcataaaat taaataaaat 

2100 



ttttccccat tgaaaaaaaa aaaaaaaaa 
2129 



<210> 4 

<211> 3950 

<212> DNA 

<213> human 

<400> 4 

cgagaactag ttttgttccg tgccctctgg actggaacct tttggagaga acccccggca 
60 

ggaccaaccc cgcacccgcc agcaccgcgg caatgtccag caatagtttt ccttacaatg 
120 



agcagtccgg aggaggggag gcgacggagc tgggtcagga ggcgacctca accatttccc 
180 
cctcggqgqc cttcggcctc tttagcagcg 
240 

tgLtagagag caacaaagat tctgctaaac 
300 

ttgcaaaagg gaaaaatgca tctgaactgt 
360 

aaaatattga gatcaagaag ttggtatatg 
420 

aggatcttgc actcctgtcc ataagcactt 
480 

taattcgr.gc aagcgctttg agagttctgt 
540 

tcatgatgct tgctattaag gaagcttctg 

600 

cagcccatgc aatacaaaaa ttatacagcc 
660 

aagtaattga aaaacttctg aaagataaaa 
720 

cttttgaaga agtatgcccg gacagaatag 
780 

gtaacttact agtggatgtt gaagagtggg 
840 

gatatgctcg gacacagttt gtcagccctt 
900 

gaaagaattt ctacgaatct gatgatgatc 
960 

cgtatactat ggatccagat catagactct 
1020 

gcaggaatgc tgcggtggtt atggcagttg 
1080 

ctgaagctgg cataatttct aaatcactag 
1140 

agtatattgt cctacaaaat atagcaacta 
1200 

cttatctgaa gagtttctat gttaggtcaa 
1260 

ttgaaatttt gacaaacttg gcaaatgaag 
1320 

agacctatgt gaaaagccag gataaacaat 
1380 

gaLgLgcaac caacatcttg gaagtcactg 
1440 

tgtccaacag ggatgaaata gttgttgctg 
1500 



atttgaagaa gootgaagat ctaaagcaaa 
tggatgctat gaagcggatt gctgggatga 
ttcctgctgt tgtgaagaat gtggccagta 
tttacctggt tcgatatgct gaagaacagc 
ttcagcgagc tctgaaggac ccaaaccaac 
caagtattag agtgccaatt attgtaccta 
ctgacttatc accatatgtt aggaagaatg 
ttgatccaga gcagaaggaa atgttaattg 
gcacattggt agctgqcagt gttgtgatgg 
atctgattca taaaaattac cgcaagctat 
ggcaggttgt cataatccac atgctaactc 
ggaaagaggg tgatgaatta gaagacaatg 
agaaggaaaa gactgacaaa aagaagaagc 
taattagaaa tacaaagcct ttgcttcaga 
ctcagctgta ttggcacata tcaccaaaat 
tgcgtttact tcgtagcaat agggaggtgc 
tgtcaattca aagaaagggq atgtttgaac 
ctgatccaac tatgatcaag acactgaagc 
ccaacataLc aactcttctt cgagaatttc 
ttgcagcagc cactattcag actataggca 
acacgtgcct caatggcttg gtctgtctgc 
aaagtgtggt tgttataaag aaattactgc 
aaacgcaacc tqcacaacat cyytcjaadtLa LL<*<*acata!: ggccaaactc ctggacagto 
1560 

tcactgttcc tgttgctarja grraagtattc tttggctaat tggagaaaac tgtgaacqag 
1620 

ttcctaaaat tgcccctgat gttttgagga agatggctaa aagcttcact agtgaagatg 
1680 

atctggtaaa actgcagata ttaaatctgg gagcaaaatC gt".attt:aacc aactccaaac 
1740 

agacaaaatt gcttacccag tacatattaa atctcggcaa gtatgatcaa aactacgaca 
1800 

tcagagaccg tacaagattt attaggcagc ttattgttcc gaatgaaaag agtggagctt 
1860 

taagtaaata tgccaaaaaa atattcctag cacaaaagcc tgcaccactg cttgagtctc 
1920 

cttttaaaga tagagatcat ttccagcttg gcaccttatc tcatactctc aacattaaag 
1980 

ctactgggta ccCggaatta tctaattggc cagaggtggc gcccgaccca tcaqttcgaa 
2040 

atgtagaagt aatagagttg gcaaaagaat ggaccccagc aggaaaagca aagcaagaga 
2100 

attctgctao gaagttttat tctgaatctg aggaagagga ggactcttct gatagtagca 
2160 

gtgacagtga gagtgaatct ggaagtgaaa gtggagaaca aggcgaaagt ggggaggaag 
2220 

gagacagcaa tgaggacagc agtgaggact cctccagtga gcaggacagt gagagtggac 
2280 

gggagtcagg cctagaaaac aaaagaacag ccaagaggaa ctcaaaagcc aaaggaaaaa 
2340 

gtgattctga agatggggag aaggaaaatg aaaaatctaa aacttcagat tcttcaaatg 
2400 

acgaatctag ttcaatagaa gacagttctt ccgattctga atcagagtca gaacctgaaa 
2460 

gtgaatctga atccagaaga gtcactaagg agaaagaaaa gaaaacaaag caagatagaa 
2520 

ctcctcttac caaagatgtt tcacttctag atctggatga ttttaaccca gtatccacLc 
2580 

cagttgcact tcccacacca gctctttctc caagtttgat ggctgatctt gaaggtttac 
2640 

acttgtcaac ttcctcttca gtcatcagLg Lcagtactcc tgcatttgta ccaacgaaaa 
2700 

ctcacgtgct gcttcatcga atgagtggaa aaggactagc tgcccattat ttctttccaa 
2760 

gacagccttg catttttggt gataagatgg tctctataca aataacactg aataacacta 
2820 



ctgatcgaaa gatagaaaat atccacatag gggaaaaaaa actLcctata ggcatgaaaa 
2880 

tgcatgtttt taaUccaata gactctcttg agcctgaggg atccattaca gtttcaatgg 
2940 

gtattgactt ttgtgattct actcagactg ccagtttcca gttgtgtacc aaggatgatt 

3000 

gcttcaatgt taatattcag ccacctgttg gagaactgct tttacctgtg gccatgtcag 
3060 

agaaagattt taagaaagag caaggagtgc taacaggaat gaatgaaact tctgctgtaa 
3120 

tcattgctgc accacagaat ttcactccct ctgtgatctt tcagaaggtt gtaaatgtag 
3180 

ccaatgtagg tgcagtccct tctggcrcagg ataatataca caggtttgca gctaaaactg 
3240 

tgcacagtgg gtcaLtgatg ctagtcacag tggaactgaa ggaaggctct acagcccagc 
3300 

ttatcataaa cactgagaaa actgtgattg gctctgttct gctgcgggaa ctgaagcctg 

3360 

tcctgtctca ggggtaacct gcttacatct ggactttaga atctggcaca caacaaaagt 
3420 

gcctggcatc cactactgct gcctttcatt tataataata gcccttccat ctggcagtgg 
3480 

gggtagaata cactcttgac attcttgtct cctgctttag aatgctagtg tgtatctatc 
3540 

atgtatgcaa tactttcccc ctttttgctt tgctaaccga agagcatata ttttactgtc 

3600 

agttgtctca actcttgaat ccatgtggcg ttf.tctctgt cctgctgctt cttttggcct 
3660 

cctcgttttc cttctctttt tcgacaatgg tagacatgaa tgagatattt aaagttcatt 
3720 

ggaaatcttc ttccctacag cagtaagcaa aaattagcaa agagatagtc taaatggcct 
3780 

ctcagcttgg tatgtgaaaa tgagatcaca tactttttaa atccaaatac aaaagcatag 
3840 

tctctgcaag attttgttct ttgaatttct tgatattgta attgattatt gataactgtc 
3900 

atcatgaaat tatctctcaa taataagata aataaactag cataCgaatc 
3950 



<210> 5 

<211> 5191 

<212> DNA 

<213> human 

<400> 5 

gagaaagaaa aacagctcga gacctcatgc aaagagaaaa ctgagtatct acagaaaatg 
60 
gttcagagga atgaaagata taaacaagat gtggagaggt tctatgaacg gaagcgacat 
120 

ttagatttaa ttgagatgct tgaagcaaaa aggccatggg tggaatatga aaatgttcgt 

180 

caggaatacg aagaagtaaa actagttcgt gaccgagtga aggaagaggt cagaaaactt 
240 

aaagaagggc agattcctat aacatgtcga attgaagaaa tggaaaacga gcgtcacaat 

300 

ttggaggctc qaatcaaaga aaaggcaaca gatattaagg aggcatctca aaaatgcaaa 
360 

cagaagcaag atgttataga aaggaaagat aaacatattg aggaacttca gcaggcttta 

420 

atagcaaagc aaaatgaaga gcttgaccga cagaggagaa taggtaatac ccgcaaaatg 
480 

atagaggatt tgcaaaatga actaaagacc acggaaaact gcgagaatct tcagccccag 

540 

attgatgcca ttacaaatga Lctgagacgg attcaggatg aaaaggcatt atgtgaaggc 
600 

gaaataattg ataagcgaag agagagggaa actctagaga aggagaaaaa gagtgtggac 

660 

gatcatattg tacgttttga caatcttatg aatcagaagg aagataagct aagacagaga 
720 

ttccgtgaca cgtatgatgc tgttttatgg ctaagaaata acagagacaa atttaaacaa 

780 

agagtctgtg agcccataat gctcacgatc aatatgaaag ataataaaaa tgccaaatat 
840 

attgaaaatc atattccatc aaatgactta agagcctttg tatttgaaag tcaagaagat 

900 

atggaggttt tcctcaaaga ggttcgtgac aataaaaaat taagagtaaa tgctgttatt 
960 

gctcccaaga gttcatatgc agacaaagca ccttcaagat ctttgaatga acttaaacaa 

1020 

tacggatttt tctcUtattt gagagaatta tttgatgcac ctgatcctgt aatgagttac 
1080 

ctttgctgtc agtatcatat tcatgaagtt cctgtaggaa ctgaaaagac cagagaaaga 

1140 

attgaacggg taatacaaga aacccgatta aaacagactt atacagcaga agaaaagtat 
1200 

gtggtgaaaa cttcttttta ttcaaacaaa gttatttcta gtaacacatc tctaaaagta 

1260 

gcgcagtttc tcactgtcac tgtggaccta gagcagagaa gacacttaga agaacagcta 
1320 

aaggaaattc atagaaaatt gcaagcagtg gattcagggt tgattgcctt acgtgaaaca 

1380 
agcaaocatc tggagcacaa agacaatgaa cttagacaaa agaagaagga gcttcttgag 
1440 

agaaaaacca agaaaagaca actggaacaa aaaatcagtt ccaaactagg aagtttaaag 

1500 

ctgacggaac aggatacttg caatcttgaa gaggaagagc gaaaagcaag taccaaaatc 
1560 

aaagaaataa atgttcaaaa agcgaaactt gttaccgaat taacaaacct aataaaqatt 
1620 

tgtacttctt tgcatataca aaaagtagat ttaattctcc aaaatactac agtgatcUct 
1680 

gagaagaaca aattagaatc agattaLatg gccgcatctt cacaactccg tcttacagag 

1740 

caacatttca ttgaattgga tgaaaataga cagagattat tgcagaaatg caaggaactt 
1800 

atgaaaagag ctaggcaagt atgtaacctg ggtgcagagc aqactcttcc tcaagaatac 

1860 

cagacacaag tacccaccat cccaaatgga cacaactccL cactccccat ggttttccaa 
1920 

gaccttccaa acacaLtgga tgaaattgat gctttattaa ctgaagaaag atcaagagct 

1980 

tcctgcttca cgggactgaa tcctacaatt gttcaggaat atacaaaaag agaagaagaa 
2040 

atagaacagt taactgagga actaaaggga aagaaagttg aactagatca atacagggaa 
2100 

aacatttcac aggtaaaaga aaggtggctt aatccUtLaa aagagctggt agaaaaaatt 
2160 

aatgaaaaat tcagcaattt ttttagttcc atgcagtgtg ctggtgaagt tgatctccat 
2220 

acagaaaatg aggaagatta tgataaatat ggaattcgaa ttagagtcaa atttcgaagt 
2280 

agtactcaac tgcatgaatt aactcctcat catcaaagtq gaggtgaaag aagtgtttct 
2340 

accatgttat acttgatggc acttcaggag ctaaatagat gtccattcag agtagttgat 
2400 

gaaatcaatc agggaatgga cccaatcaat gaacggagag tgtttgaaat ggttgtaaat 
2460 

actgcctgta aagaaaatac atctcaatac tttttcataa caccaaagct cctgcaaaat 
2520 

cttccttatt ctgaaaagat gacagttttg tttgtctaca atggccctca tatgctggaa 
2580 

ccaaacacat ggaatttaaa gqctttccaa aggcggcggc gccgtattac attcactcaa 
2640 

ccttcttaat aaaagtaaag agagggaact tgggaatttt ttttgttaaa ttctgtttat 
2700 
aagtatqgct caactrgaata aaaggagatt cactaaaacg aaaagcagtt attcttggaa 
2760 

acctgctttt aaatacaaat aggttgataa tggaaactat aatgaccttt ccaaaatagc 

2820 

agctggtaqt aaaaguuaag tcttcttcag tcttggttga acttgagttc ttggcactct 
2880 

gaccatgagt cattcagttc tcatgttaaa atgtacttaa tattacaaat caaaggtaca 

2940 

gtggaagaag ggttaatcac aagaagttac ttatatggta gccctgagct ttaattgcag 
3000 

agtaacttta attactttta gagcctaaag atgactctag agcctaagtc ctagtttctc 

3060 

ccaatgttat atttaatttt aaaaaaUtga tatgaaaatg tctaatgtat agtaataatt 
3120 

tatgacagat ctagtcattt cttcctatta aaaaagatta ccttatctcc agtaggaaat 

3180 

ggaattttat gggcctttaa aagaaagttt tatgaaactt gatgctataa ttttattggt 
3240 

atttcaaggg gaaaaaagca ctggggttca aaaatggtag cagaactgct ttgaaatgct 

3300 

gcaaggtggc cactagatga Lgcaaaatac aaccaaaaga ttgactgaga ataaaattag 
3360 

gtgacaaggg tttttaaaga ataacctttt aaagtgtggg ggcaggggtt gctttttttt 

3420 

attttattta aagtcaatta tattttacat cttacatttc taaaagcatt ttataattat 
3480 

ttttagtaag atttttctta aaatttcata tactggtttc tacaatttaL aUttgaaatt 

3540 

tcLcagtgtt atgtaaagag tgatggaaaa gcattgattt ctttaaaacc gtaatgtttt 
3600 

tagaacttaa gcctataggg cctttcttac aatgttgatg tacccattat cttagaaaat 

3660 

ctagtttaaa ctgttttctt tcaccgcaaa agaattaaat gggaaaatca tttgtttatc 
3720 

tctaagttat actaattagt agaaccaaac aaattatctt cttttaaaaa ataaatctta 

3780 

CaggaaaaCa gacagtccaa agtcatgtct ttgaacagtg gattggatct gtgccagtaa 
3840 

tgacaaaatt atttttttga cttgcttgcc tgaataaatt gaagaattgc tttcagtttg 

3900 

ggttttgtat attcttaagt agccattgaa atttatattc ttaactaggt caaaaaataa 
3960 

tgagccataa gtttatgtcc tctcacttag acattttctc tttaaaaagg tattttcttc 

4020 
tttataaaca ttttaaaaga gccttccctt cttaaactaa ctccagtgca tgaagtgtga 
4080 

aaatatttta aaatgacatt tttactaata tgagcaagtc atgtaaacat tgaagaactt 
4140 

ggtaacatat tagtaaatgg atattaccaa atgttttcat cgttaattac tttqcgttcc 
4200 

accaaaataU ctttactaaa atgtgcttgg tgtagtttgt ttattgLcta aattagtacc 
4260 

aglcatctta tttctgcaaa atgagtatca atgtgaaaaa gacacgtgaa gattaagcat 
4320 

gtttgaaaat aaaatggtca attacatttc aatttacata ggccaacaac tgttccatac 
4380 

tttgtttgta aacatttaat ttctctactg gacaaaatta atatttggct ttacattgaa 
4440 

ttttgagctg tgaagaataa attatgCatc attttagcat attaaacagfc agtaagtcta 
4500 

gcacatagtc tcagccactt aaaacaaaag tttttttgtt tgtttgtttg tttgtttttt 
4560 

tgagatggag tctcactctg ttgcccaggc tggagtgcag tggcgtgatc tcggcttact 
4620 



gcaacctccg cctcccgggt tcaagcgatt ctcctgcctc agcctcccaa gtaactggga 
4680 

caacaggcgc gtcccaccac acccagctaa ttttttatac ttttagtaga gatiggggttt 
4740 

cagcatattg gccaggctgg tctcgaactc ctgaccttgt gatccacccg cctcggcctc 
4800 

ccaaagtgct gggattatag gcgtgagccc ctgcacccgg ccaaaagttg atttttaatt 
4860 

acataaaaat cgtaaaaact tctagtaaaa acttgatttg gtgaatacag ttatatttta 
4920 

aaaccltaag gtgacaagca ttttctatgc ctaaatcttc attggtttgc ctggaaagag 
4980 

tctctgttaa aagattttcc atattcaaag taaaaggaaa gatttcttgc ttcctaattg 
5040 



tcttttggac acatgcctat tttctttgag gtataaacct ttagatgtga aaaatgtaat 

5100 



ttcattctgc tattgtgtgt gcttgtgtgt gtgtaattga aaaaactggg aaaLcctqct 
5160 

ttgttggtaa taaatcaata tttttatatc c 
5191 



<210> 6 

<211> 4755 

<212> DNA 

<213> human 

<400> 6 

aagagatctt ccaggctctc agagccctgg gagggcgatt tccaggaaga ccacaatgcc 

60 

aacctctgga ggaggctgga gagayaayyc clciyyccciya gcctgtcagg caactttggc 
120 

aagaccaagt cagccttctc atctctccag aacattcctg agagtctgag aagacacagc 
180 

agcctggagc taggccgggg aacccaggag ggttaccccg ggggcaggcc cacctgtgca 
240 

gtcaacacca aggcagaaga ccctgggagg aaagccgctc ctgacctcgg gagccatctg 
300 

gaccggcagg tttccUaccc gcggcccgag gggaggaccg gtgcctcggc ttctttcaac 
360 

agcacagacc caagtcccga agagccgcct gccccctcgc acccgcacac atccagtctq 

420 

ggccggaggg ggcccggccc aggcagcgcc tcqgctcttc agggctttca gtacyggaag 
480 

ccccactgct cggtgctgga gaaggtctcc aaatLcgagc agcgagagca agggagccag 
540 

agaccgagtg tgggcggctc tggttttggc cataactata ggccccacag gaccgtctca 
600 

acttccagta cttctgggaa tgacttcgag gagacaaaag cacacattcg tttctctgag 
660 

tcagctgaac ccctaggcaa cggggagcag cacttcaaaa acggggagct gaagttggaa 
720 

gaggcttccc ggcagccctg cggLcagcag ctgagcggag gagcgtcgga cagcggccgt 
780 

ggcccccaga ggccggacgc tcggctcctc cgtagccaga gcaccttcca gctctccagc 
840 

gagccagaga gggagcccga gtggcgggac aggcccggct cgcccgaatc gcccctgctg 
900 

gatgccccct tcagccgcgc ctaccggaac agcatcaagg acgcacagtc ccgtgtcttg 
960 

ggggccacct cctttcgacg tcgagacctg gagctggggg cgcccgtggc gtcgaggtcc 
1020 

tggcggccac ggccttcctc ggcccacgtg gggctgcgga gccccgaggc gtcggcctcc 
1080 

gcctccccgc acacgccccg ggagcggcac agcgtgaccc ctgctgaggg cgacctggcc 

1140 

aggcccgtgc cccctgccgc ccggagaggt gctcgccggc gcctgactcc cgagcagaag 
1200 

aagcgctcct actcggagcc cgagaagatg aacgaggtgg ggaUcgLgga ggaggccgaa 
1260 

ccggcacccc tgggcccgca gagaaatggg atgcgtttcc cggagagcag cgtggccgac 
1320 

cggcgccgtc tcttcgagcg cgatggcaag gcctgctcca cgctcagcct gtcggggccc 
1380 

gagctgaagc agctccaqca gagcgccctg gcgyacLaca tccagcgcaa gaccggcaag 
1440 

cggcctacct ccgccgccgg ctgcagcctc caggagcccg ggccactgcg tgagcgcgcc 
1500 

cagagtgcct acctccagcc cggccccgcg gcgctcgaag gctccggcct cgcctcqgcc 
1560 

tccagcttga gctcactgcg ggagcccagc ctgcagcccc gcagggaggc cacgctcctg 

1620 

ccggccacag ttgcagaaac ccagcaggct ccccgagatc gcagcagctc cttcgccggt 
1680 

ggccgccgcc tcggggaacg gcgacgcggg gacctgctta gcggagcaaa cggtggaaca 
1740 

aggggcaccc agagagggga tgagaccccc agggagccat cctcctgggg ggccagggcc 
1800 

gggaagtcca tgtcggccga ggacctgctg gaacgctcgg acgtccttgc gggccctgtc 
1860 

catgtgaggt ccaggtcatc tcccgccacc gcagacaagc gccaggatgt gcttttgggg 
1920 

caagacagtg gctttggtct tgtgaaggat ccatgttatt tggctggtcc tggatctagq 
1980 

tcactcagtt gttcagaaag aggccaagaa gagatgctgc tgctcttcca ccatcLcacc 
2040 

cctcgttggg gtggttcagg ctgcaaagcc attggtgatt cctccgttcc tagtgaatgt 
2100 

cctggaaccc tggaccatca gaggcaagcc agtaggacac cctgccccag gccaccactg 
2160 

gcaggaacgc aagggctggt cacagacacc agggctgcac ccctgacccc aattggcacc 

2220 

cctctgcctt cagccattcc ctctggctac tgctcacagg acggtcagac agggcgacag 
2280 

cctctcccqc cctacacccc tgccatgatg cacagaagca atggtcacac cctgacccag 
2340 

cctcccggtc caagaggctg tgagggcgat ggcccagagc atggggtaga agagggaacg 
2400 

aggaagaggg tctcgctgcc tcagtggcca cctccttctc gagcaaagtg ggcccacgca 
2460 

gccagagagg acagccttcc tgaggaatcc tcagcccctg attttgcaaa cctgaagcac 
2520 

tatcaaaaac agcagagtct tccaagttta tqcagcactt ctgacccaga cacaccuctt 
2580 

ggggccccga gcactccagg gaggatctcc ctccgaatat ctgagtctgt cctgcgggac 
2640 

nccccgccac ctcatgagga ttatgaagac gaagtgtttg tgagggatcc gcaccccaag 
2700 

gccacgtcca gccccacatt tgaacctctt cccccacccc cacctcctcc accgagtcag 
2760 

gaaaccccgg tgtatagcat ggatgacttc cctccacctc ctccccocac tgtatgtgag 
2820 

gcgcagctgg acagtgagga tcccgagggg ccacgcccca gcttcaacaa actttctaaa 
2880 

gtgacaattg caagggaaag gcacatgcct ggtgcagccc atgtggtagg tagtcagaca 

2940 

ctggcttcca gactccaaac LCctatcaag ggttcagagg ctgagtccac accaccctcc 
3000 

ttcatgagcg ttcacgccca acttgctggg tctcttggtg ggcagccagc acccatccag 

3060 

actcaaagcc tcagccatga tccagtcagt ggaactcagg gtttagaaaa gaaagtcagt 
3120 

cctgatcctc agaagagttc agaagacatc agaacagagg ctttggccaa ggaaattgtc 
3180 

caccaagaca aatctctagc agacattttg gatccagact ccaggctgaa gacaacaatg 
3240 

gacctgatgg aaggtttgtt tccccgagat gtgaacttgc tgaaggaaaa cagtgcaaag 
3300 

aggaaggcca tacagagaac tgtcagctct tcaggatgtg aaggcaagag gaatgaagac 
3360 

aaggaagcag tgagcatgtt ggttaactgc cctgcctact acagtgtgtc tgctcccaag 
3420 

gctgagctac tgaacaaaat caaagagatg ccagcagaag tgaatgagga agaggaacag 
3480 

gcagatgtca atgaaaagaa ggctgagctc attggaagtc tcacccacaa gctggagacc 
3540 

ctccaggagg cgaaggggag cctgctcacg gacatcaagc tcaacaacgc cctgggagaa 
3600 

gaggtggagg ctctgatcag cgagctctgc aagcccaatg agtttgacaa gtataggatg 
3660 

ttcatagggg atttggacaa ggtggtcaac ctgctgctct ccctctcggg gcgtctagcc 
3720 

cgtgttgaga atgtccttag cggccttggt gaagatgcca gtaatgaaga aaggagctct 
3780 

ctttacgaga aaaggaagat cctggctggt cagcatgagg atgcccggga qctgaaggag 
3840 

aacctggatc gcagggagcg agtagtgctg qgcatcttgg ccaattacct ttcagaggag 
3900 

cagctccagg actaccaqca cttcgtgaaa atgaagtcca cgctcctcat tgagcaacgg 
3960 

aagctggatg acaagatcaa qctgggccag gagcaggtcja agtgtctgct ggagagcctg 

4020 

cccccagacc tcattcccaa ggctggggcc ctggctctgc ccccaaacct cacgagtgag 
4080 

cccattcctg ctgggggctg tactttcagt ggtattttcc caacattaac ctctccactt 
4140 

taacctcttc taaaataccc aaccaaaaga tcactgtttc tctcaacact atttaatctg 
4200 

aaaaatgttt cagtacaaac cactgtttga actatctggg ttattggtgt ttgttcctga 
4260 

tgaaaggaaa aaaattctct ccaggaggaa gcctttttcc ttcttgccct tcctgattga 
4320 

tcttctgaga gctcgaatgc tgctggacac gtaccccttt ctattattac tttgtagtag 
4380 

aaagaaagtt aatgaaactg agaactgatt ggagggtgtt Ugatcattta gtttttaaca 
4440 

ggctgaggca acatggatca gtgtgtgtcc ccctcaggaa tgtatccaca gtggccttcc 
4500 

ttgctggUgg gcagtgtatc ctgatggcag ggtacaagta ccattaatga agggtctgca 
4560 

acataaagcc ttaaaaagac acacactaag aaaactgtaa aaccttgaac attgttattt 
4620 

atatttttta aaatggaaaa gatcactatg ttlgttgtgc taaccactta tttgattctg 
4680 

ttttgtggtg gacatagatg attacgtttg agctttgtat tttgtgaaaa ccttaatgaa 
4740 

atgaattcca aagat 
4755 



<210> 7 

<211> 2045 

<212> DNA 

<213> human 

<400> 7 

gaaacttgac cccggctcat cctgtctctg gctgtggccc ggcaaagcac tgaaaacccc 
60 

tctggtctca gagacagtag gggcagtgcc actttctaca acctgccaac ccacacactg 
120 

gagtaattct gaaaaaaatt attcctaaac tctctaagtg tggacggaga atgagcaagc 
180 

cccagaagta ttttacaacc agagtgggta atgaggaggg ggcttactgg aaCcgtcata 

240 

tctctgaata ttgaaaacaa caactaaaaa agtggacctt ctcagaaaaa aagggcagca 
300 
aatgaccaaq ggcgcccctt ctggccgtgc ttggcttgag taactgtctc tctttcccca 
360 

cccccatcac agggctttca gtttggcaaa ggaaaagcag ataaaaacag aacattccat 
420 

atgtttcttt ctccatcggc caaaaacatt ttgacacaat gtttgtgaaa cacctttgga 
480 

gaggtgcact tctgaatgct gcctctgccg taaatcctgg ggoaagggat cagcctcttc 
540 

ccaggaacca tcgccttcta taaaccgtga actcaagcag gcattttttt tttcttaccg 
600 

aaaggctgct attgtgcaag ggcacataat gggtctgttg ctcttactgg cttcjcaaatg 
660 

tgcotggcaa agagagagat gtgggcctag agcagaLata ttcagcaagg tgacagcttc 
720 

ccataacaat tctaacactt cttatcttat gtgagaataa aatatttaag ggttgaacct 
780 

tatttlgcca aatgtatctt ttctgctttt gaattgggca gaagatttta gcaactatat 
840 

tctacaaatg ttacttataa cacacacaca cacatctgaa atatatgccg aaaattgacg 
900 

tctttgacct cagggagagc acctgtccag gLctgcctaa aggaaatggc tccagtgggt 

960 

ctaaacaacc acatcctatc catggatagg tctagtcata acactttago gagaatgtca 
1020 

gagcaggagg gaggcaagcc gcctcttctc ggccatcaac tgcagatgat gaaagagcgg 

1080 

gattcaactt tgttttcttt tcctgtggcc ccagtgaaac ctcctgccct ccctgcacgt 
1140 

ctgtgtcttc atttctaaaa tgggggtgat gctttcatat tgacctcacc ccatactacc 

1200 

tcacagatgt gttgtgagga ttaataaaat tatgtctatg gtattttcag tttctggaga 
1260 

aaaatactta tagacagttt aactattaca tagatatata agtgatctca gtttcttgtt 

1320 

tgctgtgata ctaatgtgtt gttttaactt attccataaa atgacagttg tgLcctagcc 
1380 

acatcagaca gctatctaag ctctggacta cccctttgtg cagctgaatc actgcagggt 
1440 

cgaccatgcc tggtgccaca gccatggttt ccatttctag atgaaaggat ggcctaggac 
1500 

ataggtctca aagactcttg gatcagaatc aggagattag ggaaaacagg atggatacct 
1560 

gagcactaac agcagtagac gtagacctct gtcctttacc atctgaggtc ttctggattc 
1620 
tttgtggggt taattttgat ctqatgtcat ctgLttgccc ttcatcttgc ttgcaagtgt 
1680 

gcatggttca atcccLcaca tccaggaaat gaattttgca attgggccag atgctaattt 
1740 

gcacgttgat tcaccttctt tgcctttaag cctttttttt cttttttttt ttttttggca 
1800 

aatgaatgta ccatttcaac tttgatttta atagtgctag ttgatattgg taataatgct 
1860 

aaccaagaga tcaatgccag atttttctct tggggtaagt tagctgaagt catttaaaga 
1920 

tggaaaggtg ggaaaattct r.tgatatttg atgtcattgt atccacattt gttgtaagac 
1980 

atattgcata ccaattataa ttatatcaat taaagttgat aaaagcttca aaaaaaaaaa 
2040 

aaaaa 
2045 



<210> 8 

<211> 2096 

<212> DNA 

<213> human 

<400> 8 

atggagaacg agcctgtagc ccttgaggaa actcagaaga cagatcctgc tatggaacca 
60 

cggttcaaag tggtggattg ggacaaggac ctggtggact ggcgaaagcc tcLcctgtgg 
120 

caggtgggcc acttgggaga gaagtacgat gaglgggttc accagccggt gaccaggccc 
180 

atccgcctct tccactcaga cctcattgag ggcctctcta agactgtctg gtacagtgtc 
240 

cccatcatct gggtgcccct ggtgctgtat ctcagctggt cctactaccg aacctttgcc 
300 

cagggcaacg tccgactctt cacgtcattt acaacagagt acacggtggc agtgcccaag 
360 

tccatgttcc ccgggctctt catgctgggg acattcctct ggagcctcat cgagtacctc 
420 

atccaccgct tcctgttcca catgaagccc cccagcgaca gctattacct catcatgctg 
480 

cacttcgtca tgcacggcca gcaccacaag gcacccttcg acggctcccg cctggtcttc 
540 

ccccctgtgc cagcctccct ggtgatcggc gtcttctact tgtgcatgca gctcatcctg 
600 

cccgaggcag tagggggcac tgtgtttgcg gggggcctcc tgggctacgt cctctatgac 
660 

atgacccatt actacctgca ctttggctcg ccgcacaagg gctcctacct gtacagcctg 
720 
aaggcccacc acgtcaagca ccoctttgca catcagaagt caggatttgg tatcagcact 
780 

aaattgtggg attactgttt ccacaccctc actccagaga aaccccacct gaagacgcag 
840 

tgacaactcc caccccctcc gtcctgccct cagcccggcc ctggcccctt cccgaccccc 
900 

acccgccatt cagaccccat taagaaggtt ggcttggcca ggcaggatgg gctglgtccg 
960 

gccctgcagc ctagtggaag gtgctgagqg ggccctgagg caggaccgcc ctcctgaccc 
1020 

ctggtaggag ggtcacatcc acttggtgca ggtggccctt ggtgacccac ttcttcctgg 
1080 

agcgtccctg cctagagctc agcccacagg actgcttcag gccgtggcca caggtagcag 
1140 

ccgcaagggg aaatgaagaa aactgagccc tcgtggccac ctgtgtcacc cttgtgcctt 
1200 

agcctcatgg gctgcctagg agccgcctgc acggcacagc tcgctttcac agtcagaagt 
1260 

gggtctgtgg gatctgtggt ccctgtcctc cctgctgtcc cttctgggga ggctttggtg 
1320 

gctctgaggt ggacaaagag ctctcgcaag aagagacagc gtgatgcctc ccacagtcca 
1380 

ccccagaccc tggggcagcc cctctggccc tgccagctgc ctgcgtcgtt gggcccaggg 
1440 

tggctggcag gagtcccagc Cgcttgcttt aggacctggc agcttttctt gccgtccctc 
1500 

ccctgcctcc agaatcacag cccttctccc caagggaggc tgaggaggct tctccaccag 
1560 

tggcagcccc accccgtccc tggccattct tggcctccac cccgctcagg cccctacLcg 
1620 

ggcgctccca gaaggagcca cctctcagtg cctcacctcc ccctgcctcc cagcctccgc 
1680 

agatgaggtt cctgcccctt cctcctcgta accaaaaccc tcactgctcc caggacggtc 
1740 

ttatttataa accagataca tgttcttagt ctggtcccag accaaggagc tggtcagacg 
1800 

gccctttcta atcctacatg ttgagcttat gtaaaaaatg ttgtttcctc ctgtttttgg 
1860 

ttcctttctt acccacaaac cattactact tgaaacttaa aaaactcgcc aagtgtaaag 
1920 

gctaaagaga agcagtttga cggaccttgt gatttgtact gtttgctgcg gagctattta 
1980 

aagattttgg aataaatata caaaactacg gttgtgaaat aaaaactt'aa attgtatatt 
2040 

ttgaaaaato .^aacactgaa aagaaaccaa caaaaaaaaa aaaaaaaaaa aaaaaa 
2096 



<210> 9 

<211> 5640 

<212> DNA 

<213> human 

<400> 9 

ggaaacgcag aaaacagaga gaggcattct gagtcatctg actggatgaa gactgttcca 
60 

agttacaacc aaacaaatag ctccatggac tttagaaatt atatgatgag agatgagact 

120 

ctggaaccac tgcccaaaaa cUgggaaatg gcctacactg acacagggat gatctacttc 
180 

attgaccaca ataccaagac aaccacctgg ttggatcctc gtctttgtaa gaaagccaaa 

240 

gcccctgaag actgtgaaga tggagagctt ccttatggct gggagaaaat agaggacccU 
300 

cagtatggga catactatgt tgatcacctt aaccagaaaa cccagtttga aaatccagtg 
360 

gaggaagcca aaaggaaaaa gcagttagga caggttgaaa ttgggtcttc aaaaccagat 
420 

aUggaaaaat cacacttcac aagagatcca tcccagctta aaggtgtcct tgttcgagca 

480 

tcactgaaaa aaagcacaat gggatttggt tttactatta ttggtggaga tagacctgat 
540 

gagttcctac aagtgaaaaa tgtgctgaaa gatggtcccg cagctcagga tgggaaaatt 
600 

gcaccaggcg atgttattgt agacatcaat ggcaactgtg tcctcggtca cactcatgca 
660 

gatgttgtcc agatgtttca attggtacct gtcaatcagt atgtaaacct cactttatgt 

720 

cgtggttatc cacttcctga tgacagtgaa gatcctgttg tggacattgt tgctgctacc 
780 

cctgtcatca atggacagtc attaaccaag ggagagactt gcatgaatcc tcaggatttt 
840 

aagccaggag caatggttct ggagcagaat ggaaaatcgg gacacacttt gactggtgat 
900 

ggtctcaatg gaccatcaga tgcaagtgag cagagagtat ccatggcatc gtcagqcagc 
960 

tcccagcctg aactagtgac tatccctttg attaagggcc ctaaaqggtt tgggtutyca 
1020 

attgctgaca gccctactgg acagaaggtg aaaatgatac tggatagtca gtggtgtcaa 
1080 

ggccttcaga aaggagatat aatLaaggaa atataccatc aaaatgtgca qaatttaaca 
1140 

catctccaag tggtagaggt gctaaagcag tttccagtag gtgcLgatgt accattgctt 

1200 

atcctaagag gaqgtcctcc LLcaccaacc daaactgcca aaatgaaaac agataaaaag 
1260 

gaaaaLgcag gaagtttgga ggccataaat gagcctattc ctcagcctat gccttttcca 
1320 

ccgagcatta tcaggtcagg atccccaaaa ttggatcctt ctgaggtcta cctgaaatct 
1380 

aagactttat atgaagataa accaccaaac accaaagatt tggatgtttt tcttcgaaaa 
1440 

caagagtcag ggtttggclt cagggtgcta ggaggagatg gacctgacca gtctatatat 
1500 

attggggcta ttattcccct gggagcagct gagaaagatg gtcggctccg cgcagctgat 
1560 

gaactaatgt gcattgatgg aattcctgtt aaagggaaat cacacaaaca agtcttggac 
1620 

ctcatgacaa ctgctgctcg aaatggccat gtgttactaa ctgtcagacg gaagatcttc 

1680 

tatggagaaa aacaacccga ggacgacagc tctcaggcct tcatttcaac acagaatgga 
1740 

tctccccgcc tgaaccgggc agaggtccca gccoggcctg caccccagga gccctatgat 
1800 

gttgtcttgc aacgaaaaga aaatgaagga tttggctttg tcatcclcac ctccaaaaac 
1860 

aaaccacctc caggagttat tcctcataaa attggccgag tcatagaagg aagtccggct 

1920 

gaccgctgtg gaaaactgaa agttggagat catatctctg cagtgaatgg gcagtccatt 
1980 

gttgaactgt ctcatgataa cattgttcag ctgatcaaag atgctggtgt caccgtcaca 

2040 

ctaacggtca ttgctgaaga agagcatcat ggtccaccat caggaacaaa ctcagccagg 
2100 

caaagcccag ccctgcagca caggcccatg ggacagtcac aggccaacca catacctggg 

2160 

gacagaagtg ccctagaagg tgaaattgga aaagatgtct ccacttctta cagacattct 
2220 

tggtcagacc acaagcacct tgcacagcct gacaccgcag taatttcagt tgtaggcagt 
2280 

r.ggcacaatc agaaccttgg ttgttatcca gtagagctgg agagaggccc ccggggcttt 
2340 

ggattcagcc tccgaggggg gaaggagtac aacatggggc tgLtcatcct tcgtcttgct 
2400 

gaagatggtc ctgccatcaa agatggcaga attcatgttg gtgaccagat tgttgaaatc 



2460 

aatggggaac ctacacaagg aatcacacat actcgagcaa ttgagctcat tcaggctggt 

2520 

ggaaataaag ttcctcttct tttgaqqcca ggaactggcL tgatacctga ccatggtttg 
2580 

gctccttccg gtctgtgctc ctacgtgaaa cccgagcaac attaaggctt tcagggcttt 

2640 

tcttggtctt tccttaaaaa gacttggtga ttgggatatt aataatcctt cgtcttcaaa 
2700 

tgtgatttat gatgaacagt caccattacc cccatcttca cattttgctt ccatatttga 

2760 

agagtctcac qtgccagtaa ttgaagaatc ttLgagagtt cagatatgtg aaaaggcaga 
2820 

agaattaaag gacattgtgc ctgaaaagaa aagcacttta aatgaaaatc agcctgagat 
2880 

aaagcatcag tctcttctcc agaaaaatgt gagtaagagg gatccaccca gcagtcatgg 
2940 

gcacagtaac aagaaaaatc tattaaaagt agaaaatggt gttacacgaa gaggLagatc 

3000 

ggttagtccc aaaaagccag ccagtcaaca Ltcagaggaa catttggata agattcctag 
3060 

tcctctaaaa aataacccca aaagaagacc cagagatcaa tccctcagcc ccagcaaagg 

3120 

ggaaaataaa agttgtcagg tcagcaccag ggcaggctct ggacaagatc agtgcagaaa 
3180 

aagcagaggt cggtcggcca gcccaaaaaa gcagcaaaaa aLLgaaggaa gcaaagctcc 
3240 

atcaaatgct gaggccaaat tattagaggg taagagtcga agaatagcag gctatacggg 
3300 

cagtaatgct gagcagatcc cagatgggaa ggaaaaatca gacgtcatca ggaaagatgc 

3360 

aaagcagaat cagttggaaa aaagcagaac aaggtctcca gagaaaaaaa tcaaaagaat 
3420 

ggttgagaaa tctcttccat ccaaaatgac taataagact acaagtaaag aagtatctga 
3480 

aaatgaaaaa ggaaagaaag taaccacagg agaaacaagt tctagtaacg ataaaatagg 
3540 

agaaaatgtc cagctatcag aaaagaggct gaagcaagaa cctgaagaga aggtagtttc 
3600 

aaacaaaaca gaagatcaca aagggaaaga actagaggca qctgacaaaa acaaagagac 
3660 

tggaaggttc aaaccggaaa gcagttctcc agttaagaaa acactgataa ctccagggcc 
3720 

ctggaaggtt ccaagtggaa ataaagtcac aggcactaLt ggtatggctg agaaacggca 
3780 

gtaaccttta gtataaaaca aagaaaaaca agttgtaatc ttttctcaca gcagcatttt 
3840 

tccagaaaaa gccttttttC cctuttcaga tatLcLyaaa cayacaagta caLgLLaatg 
3900 

tgagcctcaa gtlacctagg ctgcatgaag ggcctttagg attgctaaga accaactgtc 
3960 

cccctggccg gctgccctcc ctcgctctca ggaaggagct gcatccacat gctcatctga 
4020 

cccgccctgc tcaggctgcc cagctcgtct tcatgagtgt ctgaacaaat gacatatgtt 
4080 

gatattaaca atgtggtcac aactcacttt gtatttgtgc caagttatct actgtatcat 
4140 

gtctgttttt atcctttttg ttcagctgtt tccacagtaa tgaaaaagtt aggtttggct 

4200 

tggaagttga tgatctcaat agcatgttgc atgtttacag agagaaatat gtgagtcctt 
4260 

gcagaagaag agactgttaa ctcatcgtta aagatggccg ttgtctcttc taacagctac 
4320 

tgatgatgtc ccactttaaa aataaaaccc ccaaacatca ctactttaag gaaaaaaaaa 
4380 

atgtagtcca atattgatgc tttcttatgg ctttttattt taatttggct ggataagttg 
4440 

tttcaaataa ctgttaaaga tattacttac aattgaatgt ttgaaataag aaagtacttt 
4500 

aagcaataga gttcatctcc tgctgtgtta tccaacctcg atgtatactt acagcatctc 
4560 

aggtcaccct Ltttatttca gttatttaat tatgaaacca taaagaagca tgtggaaata 
4620 

gtgtttattg ctctttgaag aaaaaccacc aactatttct ggatattttg gctgtaccta 

4680 

ctactaaagt cattagtctt taatacataa tacatatttg aaaagtaaac atattatata 
4740 

gattatgtga gggacttaat catgaaacca gtttcacagt ccaagtacca actcttctgg 
4800 

tagcaggtgc acaagcttgg gtgtttaaaa acaacctgtg tagggtatgc ccagcaaatg 
4860 

aggacaaatg tgtagacagt acttactgga tcttatttaa cttttagcta cattaactaa 
4920 

ctttcttatt taaaaacaag aaagggagac taaacatctg cttaacttgt acacattttc 
4980 

agaattcttt ttaaaagtct agttaaaqat gtttcttaga agttggagac tgttaacaac 

5040 

ttccaUaaaa tagatccagg tttttcagtt ccctgaagca gcattcagta gcatctatat 
5100 

aaataaaggc accttctgag aataaaacta ttttatggag tgtgtqaaca cacttgttct 
5160 

gtcacccggg ttcaccttgt tgtqaaqcac attaggtcca ggtcctLccc LcLyyyaytc 
5220 

tgactgtgaa actcLLtaac ccaacaactc aattagcccc tgtagataag acatgcttcc 
5280 

cagagtgaga tttttgaaat ccccttttca tccagaacta tatttaccca cctattgtao 
5340 

ctattcaaat ogagcaaaat taggaggctt gataaatact aagaatttag taccacagaa 
5400 

attatttatt attttccctg tagtccacaa ttagtgataa cgaatcctat ttttgttaac 
5460 

tgtgacataa ctttgatgtc atatgttgtc ctatgtggtt cttcctaagt aaactctgta 

5520 

ctgattatat actgacttag caatgtggcc ttggaatgct gagcaaaatg tggatgtact 
5580 

ggttgtaaat gtttatatat tgtacagtac ctttatatat acacttgagg ttctgattag 
5640 



<210> 10 

<211> 457 

<212> DNA 

<213> human 

<220> 

<221> misc_feature 

<222> (242)..(242) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (369)..(369) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (394)..(394) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (406)..(406) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (457)..(457) 

<223> any kind of base 



<400> 10 

tcaqtcactc uttcaccctg ccaaagcttc actyLccLac tgattgaatt gtatgtgaga 

60 

aataaaatgt catcatatta agr.ractggg atttgtatgt ttatctgtta tagcaqcaag 
120 

tcttaattta cctaatacac acattgtgac agatgttctt aatgtcccac cccatattgt 
180 

tacatgtcca gctttgagga tccctggcat gtgggggtag gagtttctgg gcatgctgga 
240 

tncaattccc acttttaagg catctgtggc ctctgtggcc tctgtggcct tcactgttat 
300 

ggaagggatt tatctggggc accataggaa actttaccat ggcacagtgg acaacctagg 
360 

agggggtgng gaggaggggc cttcaggccc aacngggggg accagngttc gtggggttag 
420 

ggtggtttgg ggggttttcc ctcttacccg tgggggn 
457 



<210> 11 

<211> 1493 

<212> DNA 

<213> human 

<400> 11 

aatagggttg gcggctgcag cgggcggcaa acagcccgcc cggcaccacc atgctcgccc 
60 

tggaggctgc acagctcgac gggccacact tcagctgtct gtacccagat ggcgtcttct 
120 

atgacctgga cagctgcaag cattccagct accctgattc agagggggct cctgactccc 
180 

tgtgggactg gactgtggcc ccacctgtcc cagccacccc ctatgaagcc ttcgacccgg 
240 

cagcagccgc ttttagccac ccccaggctg cccagctctg ctacgaaccc cccacctaca 
300 

gccctgcagg gaacctcgaa ctggccccca gcctggaggc cccggggcct ggcctccccg 
360 

cataccccac ggagaacttc gctagccaga ccctggttcc cccggcatat gccccgtacc 
420 

ccagccctgt gctatcagag gaggaagact taccgttgga cagccctgcc ctggaggtct 
480 

cggacagcga gtcggatgag gccctcgtgg ctggccccga ggggaaggga tccgaggcag 
540 

ggactcgcaa gaagctgcgc ctgtaccagt tcctgctggg gctactgacg cgcggggaca 

600 

tgcgtgagtg cgtgtggtgg gtggagccag gcgccggcgt cttccagttc tcctccaagc 
660 

acaaggaact cctggcgcgc cgctggggcc agcagaaggg gaaccgcaag cgcatgacct 
720 
accagaagct ggcgcgcgcc ctccgaciact acgccaagac cggcgagatc cgcaaggtca 
780 

agcgcaagct cacctaccag ttcgacagcg cgctgctgcc tgcagtccgc cgggcctgag 
840 

cacacccgag gctcccacct gcggagccgc tgggggacct cacgtcccag ccaggatccc 
900 

cctggaagaa aaagggcgtc cccacactct aggtgatagg acttacgcat ccccaccttt 

960 

tggggtaagg ggagtgctgc cctgccataa tccccaagcc cagcccgggc ctgtctggga 
1020 

ttccccactt gtgcctgggg tccctctggg atttctttgt catgtacaga ctccctggga 
1080 

tccLcatgtt ttgggtgaca ggacctatgg accactatac tcggggaggc agggtagcag 
1140 

tgcttccaga gtcccaagag cttctctggg attttcttgt gatatctgat tccccagtga 
1200 

ggcctgggac ctttttaaga tcgctgtgtg tctgtaaacc ctgaatctca tctggggtgg 
1260 

gggccctgct ggcaaccctg agccctgtcc aaggttccct cttgtcagat ctgagatttc 
1320 

ctagttatgt ctggggccct ctgggagctg ttatcatctc agatctcttc gcccatctat 
1380 

ggctgtgttg tcacatctgt cccctcattt ttgagatccc ccaattctct ggaactattc 
1440 

tgctgcccct ttttatgtgt ctggagttcc ccaatcacat ctagggctcc tcc 
1493 



<210> 12 

<211> 2292 

<212> DNA 

<213> human 

<400> 12 

ccatgggttc cccttcagcc tgtccataca gagtgtgcat tccctggcag gggctcctgc 
60 

tcacagcctc gcttttaacc ttctggaacc tgccaaacag tgcccagacc aatattgatg 

120 

tcgtgccgtt caatgtcgca gaagggaagg aggtccttct agtagtccat aatgagtccc 
180 

agaatcttta tggctacaac tggtacaaag gggaaagggt gcatgccaac tatcgaatta 
240 

taggatatgt aaaaaatata agtcaagaaa atgccccagg gcccgcacac aacggtcgag 
300 

agacaatata ccccaatgga accctgctga tccagaacgt tacccacaat gacqcaggat 
360 

tctataccct acacgttata aaagaaaatc ttqtqaatga agaagtaacc agacaattct 
420

acgtattctc g(]a9ccaccc 
480 

acaaagatat tgtggtttta 
540 

gggtaaacaa tcagagcctc 
600 

ccctcgttct actcagcgcc 
660 



acccagtggg tgccagccgc 
720 

aagcaagttc acctgacctc 
780 

ctgggatggc tctgatatag 
840 

tattatccac ctgcagactg 
900 

tggratcact aagtataaga 
960 

actcaatgta aatttcaagg 
1020 

ccaaaatgtt caacaccata 
1080 

tgacttcatg ctgtggacag 
1140 

ttatcccact ccatttttcc 
1200 



taactagaat ttcacaatca 
1260 



atgtcatcat gtcaaaccca 
1320 

aactttaaca acatccctaa 
1380 

ttagacctct agactcacct 
1440 



ccagataaca gaattgctgc 
1500 



cttcttgttg cacataaata 
1560 



caaaLaLgct gcttgattaa 
1620 



ctcttttatt tggtttggtt 
1680 



tgtgatcctc ctgattgtca 



aagccctcca tcaccagcaa 
acctgtcaac ctgagactca 
ccggtcagtc ccaggctgct 
acaaagaatg acataggacc 
agtgacccag tcaccctgaa 
tcagctggga ccgccgccag 
cagccttggt gtagtttctg 
gactggattc ttctagctcc 
cctgctctct tcctgaagac 
aaaaaccctc atgcctgaga 
actagagaca cLcaaattgc 
tttttcccaa gatgtcccaa 
ctgctcatgc ctgcctcttt 
gcgccttgtg caggcaattt 
aatatttgac ctaagggatc 
tacaactgtt tattcaaatg 
gttctcacgc cctgttttaa 
ctacgagctg aacagggagg 
aatacagtgg gtactataga 
aatgggtagg cttctcatgt 
catggggtct ctgcctotgg 
caatattagt taccctggtg 



caacttcaat ccggtggaga 
gaacacaacc tacctgtggt 
gctctccact gacaacagga 
ctatgaatgt gaaatacaga 
tgtccgctat gagtcagtac 
catcatgatt ggagtactgg 
catttcggga agagtgtttt 
ttcaatccca ttttctcctg 
ctataagctg gaggtggaca 
tgtgggccac tcagagctaa 
caaccaggac aagaagttga 
gcctcatcgt gacgaggctc 
aatttggtaa gataatgctg 
gacagagtgt tggatgtgtc 
ctttattctg cccagt-.ggct 
cacggtggtc cctgttagag 
tttaacccag ctatgggatg 
agtttgtgca gttgctgaca 
gacLcagtUg caaaaattaa 
ggctcattct ttaatctatt 
atcatacttc aaactcttgg 
tgctgtattc tctaaaacct 

1740 

ttaaatgttt gcatgcagcc attcgtcaaa tgtcaaatat LcUctctttg gctggaatga 
1800 

cdaaaacLcd aataaatgta tgattaggag gacatcataa cctatgaatg atggaagtcc 
1860 

aaaatgatgg taactgacag tagtgttaat gccttatgtt tagtcaaact ctcatttagg 
1920 

tgacagcctg gtgactccag aatggagcca gtcatgctaa atgccatata ctcacactga 
1980 

aacatgagga agcaggtaga tcccagaaca gacaaaactt tcctaaaaac atgagagtcc 

2040 

aggctgtctg agtcagcaca gtaagaaagt cctttctgct ttaactctta gaaaaaagta 
2100 

atatgaagta ttctgaaatt aaccaatcag tttatttaaa tcaatttatt tatattcttc 

2160 

tgttcctgga ttcccatttt acaaaaccca ctgtcctact gttgtattgc ccagtaggag 
2220 

ctatcactat atkLtgcaga atggaaactg ccctgactct tgaatcacaa at.aaaagcca 
2280 

attgtatctg tt 
2292 



<210> 13 

<211> 519

<212> DNA 

<213> human 

<220> 

<221> misc_feature 

<222> (212).. (212)

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (451).. (451)

<223> any kind of base 



<400> 13 

gaaacaacaa cagtgtaatc tttaacaggg atgttaaagg taagaagtca ggaagataaa 
60 

ccaaaatgat tgagtatgat aaagaatttt gcatggcgat taaaatagaa aacctataaa 

120 

tgtagaaaaa gcaggtctgg acttagcaaa gaaacaatat agtttggaga aggcatgaaa 
180 

taagttcttt tcatgttcac tgctggtcac ancataacag agagtgatgt ggagagcttt 
240 

gggaaggttt cacgttgagt tacatcagtg gtcaacaatg gagcaacaag actccgtaga 
300 
ggatgccacc ctgggagaeiL igcctagggaa aggaggctga agcacaactg gtaatagcct
360 

tcagatattt aatggatatg caaataaagc tctgattaat tgtattttca cttattatat
420 

atcatctttg gacctttcta aaagtgggac nctagaaaag atatactgaa actccaaaag 
480 

aatacttcag ctcgagttga atggattcaa gatgttgtt 
519 



<210> 14 

<211> 5294 

<212> DNA 

<213> human 

<400> 14 

ggctcgcatc cccatagtgc tgggttacag tgaaggtacg ccccgcgctc tgctctggag 
60 

aggcagggtg ggatagggaa cgtctcgagt ggcgcccgca gtcatggtgg tgttcgttgg 

120 

ccgccgccLc ccggcgctcc tagggctgtt taagaagaag ggctctgcca aggctgagaa 
180 

tgacaaacat ctaagtgtag ggcctggcca ggggccaggg tctgcagtgg atgagcacca 
240 

ggacaacgtc ttctttccca gtgggcgacc cccccacctg gaagagctqc acactcaggc 
300 

ccaggagggg ctccgctccc tacaacacca agagaaacag aaactgaaca agggtggctg 
360 

ggaccatgga gacacccaga gtatccagtc ctcccggacg gggccggatg aagacaacat 
420 

ctccttctgc agtcagacca catcctacgt ggctgagagc tccacagcag aggacgcgct 
480 

ctccatccgc tcggagatga tccagcgcaa aggctccacc ttccgacccc atgactcatt 
540 

tcccaaatct ggaaagtcag ggcggcgtcg gcgggagcgg cggagcactg tgctgggact 
600 

cccgcagcat gtgcagaagg agcttggcct gaggaatgag cgtgaggcac caggcacgcc 
660 



ccgggctcct ggtgcacggg atgccgtacg catccccaca gtggacggcc gcccccgagg 
720 



cacctcaggg atgggggccc gggtgtccct gcaggcgctg gaggcggagg cagaggctgg 
780 

cgctgagaca gaggccatgc tgcagcgcca cattgaccgt gtctaccggg atgacacctt 
840 

tgttggccgg tccacgggta cccgggcccc accattgacc cggcccatgt ccctagcagt 
900 



gcctggattg acaggagggg cagggcctgc agagcccctg agcccggcca tgtccatctc 
960
cccccaggcc acctacctgt cgaagttgat tccacatgct gtgctgccgc ctacagtgga
1020 

cgtggtggcc ctaggccgct gcagcctgcg cacactaagc cgctgcagcc tgcactcggc 
1080 

cagcccagcc tcagtccgct cgctggggcg cttctcctcc gtctccagcc cacagccccg 
1140 

cagccgccac ccatcctcct ccagtgacac ctggagccac tctcaatcct ccgacaccat 
1200 

tgtgtctgac ggttccaccc tctcctctaa gggtggctct gagggccagc cggagagctc 
1260 

tacggctagc aatagcgtgg taccccctcc ccagggaggc agtqggaggg gctctcccag 
1320 

tgggggcagc actgctgagg cctcagacac actcagcatt cggagcagtg ggcagttgtc 
1380 

tggccggagt gtgtccctgc gtaagctgaa gcggcctcca ccccctcccc gccggaccca 
1440 

ctccctccat cagcggggct tagcagtgcc tgatgggcca ttagggttgc cccctaagcc 
1500 

tgagcgtaag cagcagcccc agctgcctcg gccacccacc actggtggct cagaaggggc 
1560 

gggggcagca ccctgtccac ccaacccagc caacagctgg gtacctggct tgtctccggg 
1620 

tggttcccgg cgccccccac ggtccccaga acggacactt tcgccctcca gtggatactc 

1680 

gagccaaaqt ggtactccca ccctccctcc caagggcctg gcaggtcccc ctgcttcccc 
1740 

aggcaaggcc cagcccccta aaccagagcg tgtcacgtct cttcgctccc ctggggcctc 
1800 

cgtctcctct tccctcacgt ctttatgttc ctcctcctct gacccagccc cctcagaccg 
1860 

ctctgggcca cagatattga cccccctggg tgacaggttt gtcatacctc ctcaccccaa 

1920 

ggtgcctgcc cccttctccc cacctccctc caagcccagg agccctaacc cagctgcccc 
1980 

tgctcLagcc gcccctgctg tggttcctgg gcctgtttct accactgacg ccagtcctca 
2040 

gtcccctccc actccccaga caaccttgac tccactgcag gagtctcctg tcatctccaa 
2100 

agaccagtca cccccacctt ccccaccccc atcttatcat ccacccccac cacccactaa 
2160 

gaagccagag gtggttgtgg aggcaccatc tgcctcagag actgctgagg agcccctcca 
2220 

agatcccaac tggccccctc ccccaccccc tgcccctgag gagcaggacc tgtccatggc 
2280 

tgacttcccc ccaccagagg aggctttttt ctctgLggcc agccctgagc ctgcaggccc 
2340 

ttcaggctcc ccagagcttg tcagctcccc ggctgcttcg tcctcctcag ctactgcttt 
2400 

gcagattcag cccccgggta gcccagaccc tcctccagct ccgccagccc cagctcctgc 
2460

tagttccgcc ccagggcatg tggccaagct ccctcagaag gaaccggtgg gctgtagcaa 

2520 

gggtggtggg cctcccaggg aggacgtagg tgcgcccctg gtcacgccct cgctcctgca 
2580 

gatggtgcgg ctgcgctccg tgggtgctcc aggaggggct cccaccccag cactggggcc 
2640 

atcggccccc cagaaaccac tgcgaagggc cctgtcaggg cgggccagcc cagtgcctgc 
2700 

cccctcctca gggctccatg ctgcggtccg actcaaggcc tgcagcctgg ccgccagtga 
2760 

aggcctctca agtgctcagc ccaacggacc gcctgaggca gagccacggc ctccccagtc 
2820 

ccctgcctca acggccagtt tcatcttctc caagggctct aggaagctgc agctggagcg 
2880 

gcccgtgtcc cctgagaccc aggctgacct ccagcggaat ctggtggcag aactccggag 
2940 

catctcagag cagcggccac cccaggcccc aaagaagtca cctaaggctc ccccacctgt 
3000 

ggcccgcaag ccgtctgtgg gagtcccccc acccgcctcc cccagttacc ctcgagctga 
3060 

gccccttact gctcctccca ccaatgggct ccctcacacc caggacagga ctaagaggga

3120 

gctggcggag aatggaggtg tcctgcagct ggtgggccca gaggagaaga tgggcctccc 
3180 

gggctcagac tcacagaaag agctggcctg accaccaggc acctcactgg cactgctgac

3240 

ccatcccaga aacacaatct cagggacccg agcagctcca aggacgagag gatacagcag 
3300 

acacaaccta atagagaggg cgcctgcagc cttaacctcc acggccttcg atacttatgc

3360 

aagcctggtg ttgctcctgt cctcagagtc atcctgcgct catgcctttt cccgaatggg 
3420 

ttcacctctg gcagtCgccg cttcagtctt ggccttagcc tcatcttgaa gtggglagct

3480 

ggcgggagag ggtggctgcg ccccctgctg gccctgaggc tgcagagttg ggagcaggac 
3540 

acctcacctg agtttcattt tttttcatgt ccaaaccatg cacatactat agtccagaat

3600 
caaagcactt ttgaaaagtg gctgcatggc catcctccag ggcccaggaa gttgcattcc
3660

aagggcctgt ttacatggca gcagaatcca tccccggcag tcagcccata gcttgggacc
3720

agtctgtgcc cCcctgccca gtccagttta ctcctcttgg ttcctgaagg tggccaagtc
3780

attgtgttcc cacaggcLtc tctaggctgg gggcaggtgt ggggctgtgg aattccaaag
3840

cacaaaaggt gcagagggga ttggccttcc tgtgcctcoa ctcaccaacc accctcctgc
3900

cttccagttc tgccaggtgc tccatgctgg ggacaagtag gagactgcca gggcccaaag
3960

aaatgggtga gcagtagagt catctcgggg cacttggcag tgtcaagcac ctgccccttg
4020

cctccttgac cacactgggg tgggtgggcc cccagcactt cagaggcagg agcctttggg
4080

ctgagcaagc actgaggagg tggatggaag ggagcatctg gaggggggga gcttccttga
4140

gcagtgggcc caggcctggc cctccacact tcattctctg acctttctct ctcctcattt
4200

cggtgcatgt cctttctgca gctgcctttc agcacaggtg gttccactgg gggcagctaa
4260

cgctgagtga caaggatggg aagccacagg tgcattttac tcaagtcttc tctagtcaat
4320

gaggggcacc cagtgcttct agggcaggct gggtggtggt cccctaggta tcagcctctc
4380

ttactgtact ctccgggaat gttaaccttt ctattttcag cctgtgccac ctgtctaggc
4440

aagctggctt ccccattggc ccctgtgggt ccacagcagc gtggctgccc cccagggcca
4500

ccgcttcttt cttgatcctc tttccttaac agtgacttgg gcttgagtct ggcaaggaac
4560

cttgctttta gcttcaccac caaggagaga ggttgacatg acctccccgc cccctcacca
4620

aggctgggaa cagaggggat gtggtgagag ccaggttcct ctggccctct ccagggtgtt
4680

ttccactagt cactactgtc ttctccttgt agctaaucaa tcaatattct tcccttgcct
4740

gtgggcagtg gagagtgctg ctgggtgtac gctgcacctg cccactgagt tggggaaaga
4800

ggataatcag tgagcactgt tctgctcaga gctcctgatc taccccaccc cctaggatcc
4860

aggactgggt caaagctgca tgaaaccagg ccctggcagc aacctgggaa tggctggagg
4920

tgggagagaa cctgacttct ctttccctct ccctcctcca acattactgg aactctatcc
4980 

tgttaggatc ttctgagctt gtttccctgc tgggtgggac agaggacaaa ggagaaggga 
5040 

qqgtctagaa gaggcagccc ttctttgtcc tctggggtaa atgagcttga cctagagtaa 
5100 

atggagagac caaaagcctc tgatttttaa Cttccataaa atgttagaag tatatataCa 
5160 

catatatata tttctttaaa tttttgagtc nttgatatgt ctaaaaatcc attcccr.ctg 
5220 

ccctgaagcc tgagtgagac acatgaagaa aactgtgttt catttaaaga tqttaattaa 
5280 

atgattgaaa cttg 
5294 

<210> 15 

<211> 988 

<212> DNA 

<213> human 

<400> 15

gtcgtgaggc gggccttcgg gctggctcgc cgtcggctgc cggggggttg gcctgggtgt 
60 

cattggctct gggaagcggc agcagaggca gggaccactc ggggtctggt gtcggcacag 
120 

ccatggcggg cgcgttggtg cggaaagcgg cggactatgt ccgaagcaag gatttccggg 
180 



actacctcat gagtacgcac ttctggggcc cagtagccaa ctggggtctt cccattgctg 

240 

ccatcaatga tatgaaaaag tctccagaga ttatcagtgg gcggatgaca tttgccctct 
300 

gttgctattc tttgacattc atgagatttg cctacaaggt acagcctcgg aactggcttc 
360 

tgtttgcatg ccacgcaaca aatgaagtag cccagctcat ccagggaggg cggcttatca 
420 

aacacgagat gactaaaacg gcatctgcat aacaatggga aaaggaagaa caaggtcUtg 
480 

aagggacagc attgccagct gctgctgagt cacagatttc attataaata gcctccctaa 
540 

ggaaaataca ctgaatgcta tttttactaa ccattctatt tttatagaaa tagctgagag 
600 

tttctaaacc aactctctgc tgccttacaa gtattaaata ttttacttct ttccataaag 
660 

agtagctcaa aatatgcaat taatttaata atttctgatg atgttttatc tgcagtaata 
720 

tgtatatcat ctattagaat ttacttaatg aaaaactgaa gagaacaaaa tttgtaacca 
780

ctagcactta agtactcctg attcttaaca ttgtctttaa tgaccacaag acaaccaaca
840

gctggccacg tacttaaaat tttgtcccca ctgtctaaaa atgttacccg tgtatttcca
900

tgcagtotat atattgagat gctgtaactt aatggcaata aatgatttaa atatttgtta
960

aaaaaaaaaa aaaaaaaaaa aaaaaaaa
988



<210> 16

<211> 4908

<212> DNA

<213> human

<400> 16

ggataacctc gcagggtggg ccggagggcg ggcgccgccg ctgcctgtgc tgcggcgatg
60

gcccagtgtg tacaatcagt gcaggagcta atcccggact ccttcgtccc ctgtgtcgct
120

gcgctgtgca gcgacgaagc cgagcggctc actcgtctca atcacctcag cttcgcggag
180

ctgcttaagc ccttctcccg cctcacttcc gaggttcaca tgagagatcc taataatcaa
240

cttcacgtaa ttaaaaattt gaagatagca gtaagcaaca ttgtcaccca gccacctcag
300

cctggagcca tccggaagct tttgaatgat gttgtttctg gcagtcagcc tgcagaagga
360

ttagtagcta atgtgattac agcaggagat tatgacctta acatcagtgc cactactcca
420

tggtttgagt cttacagaga aacctttctt cagtcgatgc cagcatcgga tcatgaattt
480

ctgaaccact atttagcatg tatgttggta gcgtcatcta gtgaagctga acctgtggaa
540

cagttttcaa agttgtcaca agaacagcat cgaattcagc acaacagtga ttattcctac
600

cccaagtggt ttataccaaa tacacttaaa tactatgtac ttttacatga tgtaagtgca
660

ggagatgaac agagagctga atcaatttat gaagaaatga aacagaaata tggaactcag
720

ggttgctatt tacttaaaat taattctcga acatctaatc gagcatcaga tgaacagata
780

ccagatcctt ggagtcagta tctcoagaaa aatagtattc aaaarcagga atcatatgaa
840

gatggccctt gtactataac ttcaaataag aattctgata ataacttgct ttcattggat
900



EP 1 355 149 A2 



ggattagata acgaagtcaa agatggctta ccaaataact ttagagctca cccacttcag 
960 

ttggagcaat ccagtgaccc ttctaacagt attgatggcc cogatcatct aagatctgct 
1020

tcatcgttac atgaaacaaa gaaaggaaat actggaataa Itcatggtgc atgtttaaca 
1080 

cttactgatc atgatagaat tcgacagttt atacaagagt tcacatttcg gggccttttg 
1140 

ccacatatag agaaaacaat taggcaatta aacgatcagc taatatcaag aaaaggtttg 
1200 

agtcgatccc tatcttctgc aactaaaaaa tggtttagtg gcagtaaagt tccagaaaag 
1260 

agcattaatg acctgaaaaa tacatctggc ttgctgtatc ctccggaagc accagaactt 
1320 

caaatcagga aaatggctga cttatgtttt ttggtgcagc attatgatCt ggcttacagt
1380

tgctatcata ctgcaaagaa agattttctt aatgatcaag caatgcttta tgcagctggt 
1440 

gccttggaaa tggcagcagt gtctgctttt cttcaaccag gagcacctag gccatatcct 
1500

gctcattaca tggatacagc aattcagaca tacagagata tctgcaagaa tatggtgttg 
1560 

gctgaaagat gtgtgttgct tagtgctgaa cttttaaaaa gccaaagcaa atattcagag 
1620

gctgcagctc tcctaatacg gttgaccagt gaggattctg atcttcgaag tgcacttctt 
1680 

ttggaacagg cagcacattg ctttataaac atgaaaagtc ccatggttag aaaatatgca 
1740

tttcatatga tattggcagg ccatcgattt agtaaagcag ggcagaaaaa gcatgcttta 
1800 

cgctgttatt gtcaagccat gcaagtttac aaaggaaaag gctggtctct tgcagaggat 
1860

cacattaatt tcactattgg gcgccagtcc tatactctta gacagctgga taatgctgtg 
1920 

tctgctttta ggcatattcL aattaatgaa agtaaacaat ctgctgctca acagggggct 
1980

ttcctcagag aatatcttta tgtttacaag aatgtaagtc agctgtcacc agatggtcct 
2040 

ttgccacagc ttcctttacc gtatattaac agttcagcaa cacgggtttt ttttggccat 
2100

gacagacgac cagcggatgg tgaaaaacaa gcagctactc atgtaagtct tgatcaagaa 
2160 

tatgaltctg aatcctctca gcagtggcga gaacttgagg aacaagttgt ttctgtggtt 
2220
aacaaaggag taactccatc caatCCtcat cccacacaat actgtttgaa cagtcactca
2280 

gataattcaa gatttccact tgcagttgta gaagaaccaa ttacagtrgga agtggctttt 
2340

agaaaccctt tgaaagttct acttttgttg actgatttgt cattgctttg gaagtttcat 
2400 

cctaaagatt tcagtggaaa ggataatgaa gaagttaaac aactagttac aagtgaacct 
2460 

gaaatgattg gagctgaagt Catttcagag ttcttaatta atggcgaaga atcaaaagtg 

2520 

gcaagactaa agctctttcc ccatcacar-a ggggagctgc atattctggg agttgtttat 
2580

aatcttggca ctattcaggg ctctatgaca gtagatggca ttggtgctct tcccggatgt 

2640 

cacacaggaa aatattcctt gagtatgtca gtccgaggga agcaggattt agaaattcaa 
2700

ggtcctcgac ttaacaacac aaaagaagag aaaacatctg ttaaatatgg ccctgatcga 
2760 

cgtttagatc ccataatcac agaagaaatg ccactgttgg aggtgttctt tatacatttt 
2820

cctacagggc ttctctgtgg agaaatccga oaagcatatg tagaatttgt caatgtcagc 

2880 

aaatgtccac ttactggatt gaaggttgtt tctaaacgtc cagagttctt tactttcggt 
2940

ggtaatactg ctgttctaac accactaagt ccctcagctt ctgagaattg tagtgcttac 

3000 

aagactgttg tgacagatgc tacctctgtg tgtacagcac tcatatcatc agcttcttct 
3060

qtagactttg gcattggcac aggaagtcaa ccagaggtga ttcctgttcc ccttcctgac 
3120 

actgttcttc tacccggagc ctcagtgcag ctgccaatgt ggttacgtgg gcctgatgaa 
3180

gaaggtgtcc atgaaattaa ctttttgttt tactatgaaa gtgtcaaaaa gcagccaaaa 

3240 

atacggcaca gaatattaag acacactgca attatttgta ccagtcggtc tttaaatgta 
3300

cgggccactg tctgcagaag taattctctt gaaaatgaag aaggcagagg aggcaatatg 

3360 

ctagtctttg tggatgtgga aaataccaat actagtgaag caggcgttaa ggaattccac 
3420

ataglcjcaag tatcaagtag tagcaaacac tggaagttac agaaatctgt aaatctttct 
3480 

gaaaacaaag atgccaaact tgccagtagg gagaagggaa agttttgctt taaggcaata 
3540
agatgtgaga oogaagaagc ggccacacag tcctctgaaa aatatacctt tgcagatatc
3600

atctttqqaa atgaacagat aataagttca gcaagcccat gtgcagactt cttttatcga
3660

agtttatctt ctgaattgaa aaaaccacaa gctcacttgc ctgtgcatac agaaaaacag
3720

tcaacagagg atgctgtgag attgattcaa aaatgcagtg aggtagattt gaatattgtc
3780

atattatgga aggcatacgt tgUggaagac agtaaacagc ttattttgga aggtcaacat
3840

catgttalLc ttcgcactat aggaaaagaa gccttttcat atcctcagaa acaggagcca
3900

ccagaaatgg aactattgaa atttttcagg ccagaaaaca tLacagtttc ctcaaggcca
3960

tcagtagagc agctttctaq tctcattaaa acgagtcttc actacccaga atcatttaat
4020

catccatttc atcaaaaaag cctttgttta gtaccagtca ctcttttact ttccaattgt
4080

Uctaaggctg atgtagatgt catagttgat cttcggcata aaacaacaag tccagaagca
4140

ctggaaatcc atggatcatt cacatggctt ggacaaacac agtataaact tcaacttaaa
4200

agccaggaga ttcacagtct gcagctgaaa gcatgctttg ttcatacagg tgtttataac
4260

cttggaactc ctagggtatt tgccaagtta tcggaccaag ttacagtgtt tgaaacaagt
4320

cagcagaatt ccatgcctgc cctgatcatc atcagtaatg tgtgacaact tggaaatttg
4380

tactgaaatc cacaataatc agtttttgct ggatgggtLt tacagcagta tttgatatac
4440

ctaacctgtt atggaggttg aLtgatatct gatccctgca aaatactttg acttgtcatt
4500

ttgttgatga Ugcaaagcac gttggactga gaatacttaa cattcttttt ctgtatttct
4560

ttaaaccctg agaataattt acatgctcat aatacaggat atcagcatat ttgtgcacct
4620

tattaagccc catcttaaga aaacacaaag tctaagtctg ctgttacaac ttgtcaatgg
4680

tatacgaata ttaggagatq attctgagaa aggaaaggcc ttgttggcag tactcctgtt
4740

aagccattag tctctaaatt ccagctttac tgtgaagttc tatagagtgt taaatacaaa
4800

ttttcctgtc ttgcttcaca cagttcctta aaatcagttt tgaactttgg tcatagagtc
4860

LLcdLaLtLc agUatttggt ggtccctatg acttatacat aactttgt
4908 



<210> 17 

<211> 435

<212> DNA 

<213> human 



<220> 

<221> misc_feature

<222> (30).. (30) 

<223> any kind of base 



<220> 

<221> misc_feature

<222> (49).. (49) 

<223> any kind of base 



<220> 

<221> misc_feature

<222> (75).. (75) 

<223> any kind of base 



<220> 

<221> misc_feature

<222> (76).. (76) 

<223> any kind of base 



<220>

<221> misc_feature

<222> (78).. (78)

<223> any kind of base



<220> 

<221> misc_feature

<222> (79).. (79) 

<223> any kind of base 



<220> 

<221> misc_feature

<222> (109).. (109)

<223> any kind of base 



<220> 

<221> misc_feature

<222> (136) . . (136) 

<223> any kind of base 



<220> 

<221> misc_feature

<222> (137) . . (137) 

<223> any kind of base 



<220> 

<221> misc_feature 
<222> (149) . . (149) 
<223> any kind of base



<220> 

<221> misc_feature

<222> (227) . . (227) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (236) . . (236)

<223> any kind of base 



<220> 

<221> misc_feature

<222> (246) . . (246)

<223> any kind of base 



<220> 

<221> misc_feature

<222> (342) . . (342) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (363) . . (363) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (389) . . (389) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (426) . . (426) 

<223> any kind of base 



<400> 17 

ggtagaaatg attgtgatgt acaaattttn tattttgatc atacttaana agacagagca
60

gactcacatt cattnncnna atagtatcac tgtacacata gcgaatttnt ggcgctttta
120

gattgctctg aaaatnnctg aagagttgnc catagcagcc tggtaagcct tttcctttcc
180

cccaaagctc tcctgccctt tgcagaaaga ctgttggtga caactgntgc taactnaata
240

gcatgnggtt gaacttcgcc aaaatccttc cacctccLcc catagggcaa caggggtgac
300

ttgggcttaa agggcattga gtaagcaagt aggttatcag anaacagagg gaagattcca
360

ttntagataa tttccaaata ttacaattng tggaactcag agttcaactg ctcagttcct
420

tcttcngctg accct
435



<210> 18 

<211> 2224 

<212> DNA 

<213> human 

<400> 18 

ctttagatct gtgcagcctt 
60 

gtacttgcta taaagccacc 
120 

agtagccagg ccagtaggca 
180 

tgcacctgga caggtgtacg 
240 

cacttactgt actgtgtcgt 
300 

ctggtcgtgt aaactttttt 
360 

ctctaggaag ttggtggcaa 
420 

ccacctcaaa tgacaataaa 
480 

caggactaat ccatcgatga 
540 

ttgggagcag agtccagtgg 
600 



tcagccggcg aggcggtccc 
660 



ggtgcccgtt tggtggcagc 
720 



atgaaatgtt gccagagcag 
780 

caagtgagca aggtggcagg 
840 

agcaggggag gctgctacgg 
900 



ctgtgctgtt cccttcctgg 

960 

tgaccctccc accccctgct 
1020 

gLcagccttt caataaaagt 
1080 

cttcactttg ttctttttct 



tgcgtgccaa acttgtgaaa 
tgtcaacaaa cccccattat 
gttggggaag gtgggaagga 
tctgcaccca tcaccctcag 
ggaaggatat gctaagtgat 
ttttttttgg aaattgaagc 
gggacagcac tcacactctt 
aaactggtcc aacgaagaca 
ctggaaaaga ggctagcttt 
gtgtgaggct gacttgccga 
acagctctcc tcccagggca 
ttcagcctag ggatacctga 
caacacttcc ctgtgggcac 
acccacccaa gcctgatacg 
ctgcccactt cccagcaccg 
ctaataggga gacccttcgc 
actctcctcc acacacccct 
tatgcacaaa tgtgaacacc 
gaagtcaaac tcttatcaaa 



ttccttttac cttttttgga 
gtacagaata ggacctatcc 
tccagcgagg cccjcligagcc 
caccaggcca ccctgcagtc 
gaaagttgcg agcagtctca 
tgtagagtgc tgcccgaaat 
ctggtcatga tctctgatct 
ctgctcagca cttcagccat 
gaggaaaaca gcctgggctc 
cggtcggcag gtaatggctc 
gcctgaggag gaggaggccg 
agctgttgag caacaccttt 
agccccggga aatccggtac 
catctgggcc cgccgggctc 
tctgtcaggc ttgaacccct 
aggcacccac tgtttcaact 
ccgttccgct agcctaccct 
tgagatggag ctgaacattt 
tgccctaaaa ttattaccac 
1140

ccaagagaaa caggaaaaag gttacatgtt tttgttcact gagagtaaga tcacctgcat 
1200 

ctggaagacg ggctggtaaa ttggtttggc tacagaacag aaagaaaaca aaaacaaacc 
1260 

tcgtaaggga agtatcgcac tcagacacca ccacttccta gagccaaatg agcaatccca 
1320 

aactgcaagt gccgtaagtg ggcctgcgac gtcacaccgc ccggcccgag gtatcgcatg 
1380 

tgcgggggag gcccacacta cagctgtcct ctcgtcUaga aggcaccacc tcgctttcat 
1440 

gtcccgtgtg ttttgyaaaa gcagtatggt gtgtcatgtc tagcggcgaa cacttccctc 
1500 

cctctgtcct tgaggttgta atataaaaac tgtgtttctg tacgtgtggg tgggaattct 
1560 

ctgacggtgc tcgttcatag cacaagctta cgctgagttc Lgaactgtcg ttcacagctg 
1620 

cgtgtctgca tqgtgtcgca tctgttgtac ctttggggaa aatttgtatg taaatgtaca 
1680 

gaaataaaaa cgttgcccca ttaacagatt tcctctggaa tgtcttccct acctcacctg 
1740 

atggtafccca ccgaagggca tttcactacc attaatggtg agtaataaaa tcctccgtgt 
1800 

tcattcogac ctcactgcgt cactactttg aacgcctctg taagctgtgt cttcacccgc 
1860 

cccgaggtgg gtggagggag gcctctcact ctgcttcgag tcctggtctt aaaggtagtc 
1920 

agaggcagag gctggattaa acacacactg tttaccaagt gccactctca gaccacctga 
1980 

gagacggggg gccatcagta aaattaagag gaattttttt cccttgttcg tgtatgttct 

2040 

gctgatccgt ggcctgaagg ttcctagaga cgtcaagaaa Ugaatatctt acactgtgat 
2100 

tctgtgagga aagactggta acccaaaact ctcttctcta atgtattttt taacgaaaat 
2160 

gacaatattt ctttaataaa gtatttatac caaaaaaaaa aaaaaaaaaa aaaaaaaaaa 
2220 

aaaa 
2224 



<210> 19 

<211> 2244 

<212> DNA 

<213> human 

<400> 19 
gttgtttaaa agcaaggcat gcttgtggat gactctgtaa cagactaatt ggaattgttg
60

aagctgctcc ctggttccac tctggagagt aatctgggac atcttagtgt tttgttttgt
120

ttttttccct cctctttttt tgggggggag tgtgtgtggg gtttqttttt tagtcttgtt
180

tttttaattc attaaccagt ggttaqcctt aaggggagga ggacggattg attccacatt
240

ccacttccta gatctagtLL agaaaacatg ttccccatct ggtgctctta ggaaggagta
300

tagtaaatgc ctcatttaat aacatactcc tttttgaaag ttgccttttc tctccaccct
360

tgagtagatc cagtatttga tgaaactcat gaaagtgggc ggagcccatc ttccccctcc
420

tcttttctag gacgcactat atgtgactgt gactttaagg acatttgttt gccatttgct
480

gattttttLg ggaagttaat ttctaacttc tntcactgat aaatgaagaa aagtattgca
540

cctttgaaat gcaccaaatg aattgagttt gtaattaaaa aaattttttt tccctttcag
600

tcattgtctt atatgcttag catagatttg cagctcagta gtaCatgtgt tcctagaatg
660

cagctgaaga cctgttatgt agaggaaata cgaggggtgg tgctagaaga cagacatctg
720

tggaatgatt cacatcctct caagttagga ggatggaggc ctgcttcatt aagaagctgg
780

gggtagggtg ggggtgggga gaacacttaa caacatgggg accagtcagg ggaatcccct
840

tatttctgtt ttgcatatga ggaaccctag agcagccagg tgaggctctc tagtttaata
900

aaaatcatgg aaagacLctt aatgcagact cttcttaagt gttaataggg attttttcag
960

cttattttgg ttgcagtttc caatttttaa aaatgttgag gtaatctttc ccaccttccc
1020

aaacctaatt cttgtagatg cattagtgtt gaaccaatgc ttctcatgtc tcaatcttgt
1080

atatcatctt ttcagatgta ttaacaaaca aaacctCaaa aagagtagat gaattgccaa
1140

acacaattcc taccaataat aaatcgatca actctatcta ttcaggaaag caggaagcat
1200

ttggaccaca gtgcatgaaa acttcaacat tctgttatta gataatgaat caaccaaatg
1260

aacaatrcag agaaaagaaa attgcaataa taaaaggtaa attaacagaa agataatata
1320



agcaagatag taatagttga ccattctgaa aagcttataa catcactcat catccagcat 
1380 

cctttctgaa aacaaaggat ttttaaatca ctttatgcac atatacaaca taggaggctg 
1440 

gcaaaataat gcactatttc ttaacagcca tgtcCcttgt agaacttcaa gttaatctac 
1500 

aaatgaccat tgtgtcttaa tttagattat gaaCaccaca ttagtcaggt atttgcacta 
1560 

acccttaata gtatatacag tttctatgga aaattcagtg gtccaaaaat ttccgtagaa 
1620 

tttgagagga cgttggtggg ctgaagotag ctccttgagg gtcactgatg taggctgcaa 
1680 

tgggggttca caaggccctg acaccgtatt tatagtctaa cctttttatg aaaatctgac 
1740 

tacagctatt taaggagtag tcttaatagc tgaaaatgaa gatagagaaa gacaccaaga 
1800 

atatgacaca gtttacattc tagtgaggga cacaacaaaa tcaaatttaa aaaagagtgt 
1860 

aatagatgct gataaatact gtagataaag cacataagaa aatagaaata aagqctgtca 
1920 

atggagaagt catgattttt attttattta tttatttatt tatttgagac agagtcaggc 
1980 

tctgtgcagg ctggagtgca atggtgtgat ctcgctcact acaacctctg ctcctggctc 
2040 

aagctatcct cccacctcag ctctcaagta gctgggatca caggtgcgtg ctaccatgcc 
2100 

cggctaattt tttgtagaga tgaggttttg ccatgttgcc caggctggtc tcgaactcct 
2160 

ggactcaact gaccccacct cggcctctca aagtgctgag attataggcg tgcagccggc 
2220 



agctggccat tgtttatgtt ctgc 
2244 



<210> 20 

<211> 351

<212> DNA 

<213> human 

<220> 

<221> misc_feature

<222> (62).. (62) 

<223> any kind of base 



<220> 

<221> misc_feature

<222> (121) . . (121) 

<223> any kind of base 
<220>

<221> misc_feature

<222> (207) . . (207) 

<223> any kind of base 



<220> 

<221> misc_feature

<222> (220) . . (220) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (276) . . (276) 

<223> any kind of base 



<220> 

<221> misc_feature

<222> (300) . . (300)

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (315) . . (315) 

<223> any kind of base 



<220> 

<221> misc_feature

<222> (336) . . (336)

<223> any kind of base 



<400> 20 

tctacttcca catcggcgag accgagaagc gctgtttcat cgaggaaatc cccgacgaga 
60 

cnatggtcat cggcaactat cgtacccaga tgtgggataa gcagaaggag gtcttcctgc 
120 

nctcgacccc tggcctgggc atgcacgtgg aagtgaagga ccccgacggc aaggtggtgc 
180 

tgtcctggca gtacggctcg gagggcnctt tcacgttcan ctcccacacg cccggtgacc 
240 



atcaaatctg tctgcactcc aattcttacc aggatngctc tctttcgctg gtgggcaaan 
300 



tgcgtgttgc atctngacat ccaggtttgg gggagnatgc caacaaatta c 
351 



<210> 21 

<211> 2631 

<212> DNA 

<213> human 

<400> 21 

accttccaac ccagccctcg gctgagccgc gccgcaccat gcccgccgtg gacaagctcc 

60 

tgctagagga ggcgttgcag gacagccccc agactcgctc tttactgagc gtgtttgaag 
120

aagatgctgg caccctcaca gactatacca accagctgct ccaggcaatg cagcgcgtct
180

atggayccca gaaLgagatg tgcctggcca cacaacagct ttctaagcaa ctgccggcat
240

atgaaaaaca gaactttgct cttggcaaag gtgatgaaga agtaatttca acactccact
300

atttttccaa agtggtggat gagcttaatc ttctccatac agagctggct aaacagttgg
360

cagacacaat ggttctacct atcatacaat tccgagaaaa ggatctcaca gaagtaagca
420

ctttaaagga tctatttgga ctcgctagca atgagcatga cctctcaatg gcaaaataca
480

gcaggctgcc taagaaaaag gagaatgaga aggtgaagac cgaagtcgga aaagaggtgg
540

ccgcggcccg gcggaagcag cacctctcct cccLtcagta ctactgtgcc ctcaacgcgc
600

tgcagtacag aaagcaaatg gccatgatgg agcccatgat aggctttgcc catggacaga
660

ttaacttttt taagaaggga gcagagatgt tttccaaacg tatggacagc tttttatcct
720

ccgttgcaga catggttcaa agcattcagg tagaactgga accgaggcgg aaaagatgcg
780

ggtgtcccag caagaattac tttctgttga tgaatctgtt tacactccag actctgatgt
840

ggccgcacca cagatcaaca ggaacctcat ccagaaggct ggttacctta atcttagaaa
900

caaaacaggg ctggtcaccg ccacctggga gaggctttat ttcttcaccc aaggcgggaa
960

tctcatgtgt cagcccaggg gagccgtggc tggaggtttg atccaggacc tggacaactg
1020

ctcagtgatg gccgtggatt gcgaagaccg gcgctactgc tttcagatca ccacgcccaa
1080

tggaaaatcg ggaataatcc tccaggctga gagcagaaag gaaaatgaag agtggatatg
1140

tgcaataaac aacatctcca gacagatcta cctgaccgac aaccctgagg cagtcgcgat
1200

caagttgaat cagaccgctc tgcaagcagt gactcctatt acaagttttg gaaaaaaaca
1260

agaaagctca tgccccaqcc agaacctgaa aaattcagag atggaaaatg aaaatgacaa
1320

gattgttccc aaagcaacag ccagtctacc tgaagcagag gagctgatcg cgcctggagc
1380

gccgattcaa ttcgatattg tgcttcctgc tacagaattc cttgatcaga acagagggag
1440

caggcgtacc aacccttttg gtgaaactga ggatgaatca tttccagaag cagaagattc
1500

tcttttgcag cagatgttta tagttcggtt tttgggatca atggcagtta aaacagacag
1560

cactactgaa gtgatttatg aagcgatgag acaagtattg gctgctcggg ctattcataa
1620

catcttccgc atgacagaat cccatctgat ggtcaccagc caatctttga ggttgataga
1680

tccacagact caagtatcaa gggccaattt tgaacttacc agtgtcacac aatttgctgc
1740

tcatcaagaa aacaagagac cggttggttt tgtcatccqt gttcctgaat ccactggaga
1800

agaatctctg agtacataca tttttgaaag caactcagaa ggcgaaaaga tatgttatgc
1860

tattaatttg ggaaaagaaa ttaUtgaggt tcagaaggat ccagaagcac tggctcaatt
1920

aatgctgtcc ataccactaa ccaatgatgg aaaatatgta ctgttaaacg atcaaccaga
1980

tgacgatgat ggaaatccaa atgaacatag aggcgcagaa tccgaagcat aactcacttg
2040

cgcctgtggg ggaagagcga acaggaagga gagctacctc ctaagggttt taacgtctct
2100

gacatacagg cacactgacc tgatttccga aggctgacaa tcgtttgtgg aatgtaatct
2160

tgatgccttg atactgagac ttgggaggga aactaagaaa tggttgacag cgttcccacc
2220

catctacaat gttattttag gtgctttgtg gtaagtcttt tttcttagat tgcgctaaaa
2280

tttcttagat tgttcagcgc tcagaacaaa agtttgaaaa atgcattgtt catatgaatg
2340

tcatctcttt tcagtttcca gtatcctttt taaaaaatgg caaaagccta gatttacaat
2400

ttgatgaaca ctaaatattt cttattaata taatctattt ttgtatttta cttaatgagc
2460

tttaagtgcc tgtcgttctg aaaattgtgt atttataatt cagcttatcL cataattgga
2520

cctaatagca tttctttgtg cagttaggtg atgagcactg ctttgaggcc caagcactag
2580

tagagatgcg cgatacaggt ctagtttcgg taactgttcc agacatcaag c
2631



<210> 22

<211> 2851

<212> DNA

<213> human

<400> 22

agcatctcag gccatcatcc tgaaacttgg cagccttcgt ggagtataag gacagcatta 
60 

ttagccatca ttgggtttac tgccaacaaa aggagaggga gccataggtt ctctagatta 

120 

cactcctgag gaaagaagag cacttgccaa aaaatcacaa gatttctgtt gtgaaggatg 
180 

tggctctgcc atgaaggatg tcctgttqcc tttaaaatct ggaagcgatt caagccaagc 

240 

tgaccaagaa gccaaagaac tggctaggca aataagcttt aaggcagaag tcaattcatc 
300 

tggaaagact atctctgagt cagacttaaa ccactctttt tcactaactg atttacaaga 
360 

tgatatacct acaacattcc agggtgctac ggccagtaca tcgtacggac tccagaattc 
420 

ctcagcagca tcctttcatc aacctaccca acctgtagct aagaatacct ccatgagccc 
480 

tcgacagcgc cgggcccagc agcagagtca gagaaggttg tctacttcac cagatgtaat 
540 

ccagggccac cagccaagag acaaccacac tgatcatggt gggtcagctg tactgattgt 

600 

catcctgact ttggcattgg cagctcttat attccgacga atatatctgg caaacgaata 
660 

catatttgac tttgagttat aatatggttt tgtgacttat gagctgtgac tcaactgctt 
720 

cattaaacat tctgcattgg gtataatcta agaattgttt acaaaaagat tattttgtat 
780 

ttacccttca ttcctttttt tgatccttgt aagtttagta taaatatatc tagacattca 

840 

gactgtgtct agcagttacg tcctgcttaa agggactaga agtcaaagtt ccttgtctca 
900 

ctatttgatc tgctttgcag ggaaataact tgttttttct catgtttcat cttcttttta 

960 

tgtaaatttg taatactttc ctataULgcc ctttgaaatt tttggataaa agatgatgtt 
1020 

ttaagttcca atgagtatta ctagttactc aataccactt attgagtact ctgtttctac 
1080 

gtatgtagaa tgtataggga tagaagagtt gaaaagggaa agcaaaactt cttaagtggc 
1140 

ttccttaaaa tgtcattcat aggagatgta ctggaattgc tcattctgtg actttatttg 

1200 

tgtcctaaac attcttcagt gaaaataatt ttatttcagt caaacattta tgaggaaatg 
1260 
agatcacatc tttgtcactg 
1320 

atgctgttat caccaccccc 
1380 

aaaacagtct tccatagatt 
1440 

ttcttccaga agttaaatqg 
1500 



tatgtctttt gggaatgtat 
1560 



ttcccaggaa tatgcataat 
1620 

gcccaagtaa ccagtgatag 
1680 

cagagcccaa gaaagaattt 
1740 

ccacagactc tagtagataa 
1800 

aaaaacattg ccttcagcat 
1860 



taatgaagat ttaattttta 
1920 



aaagtgttat ttgtaagtta 
1980 



gttctagatc ggtcrracgaa 
2040 



tattcataaa gtccttatgt 
2100 

taaaatcgtc ttaaaggcaa 
2160 

aactgtcaga cacaaLLtct 
2220 

aaaataagaa catattgtac 
2280 



tcgcatttta aaggtgttta 
2340 



ctaaaaagct caaggacttt 
2400 



tttataactc tattgccata 
2460 



aaccttgatt cagtgctcag 
2520 



agaagaaaga aatccccacc 
2580 



gatgctactt gaagagggag 
tgccctctgc tgccataatc 
tttaaggaag aaaggqccca 
ggggatctga agatttgaat 
tatatgccta gctttataat 
attgaatatt tcatgtccta 
aagttagaaa aaccccttta 
tcagtggaaa aatcaatata 
tattatcatc ataatggctg 
gttcagttcg cagcactgag 
aatacaggtg gttccaagct 
atttttttac aagtcaaaca 
agttagccca tatgtatatc 
ggtcttaact aagtgaaatt 
atttaatttt tacccctgtt 
gttttcatct gagagccagg 
actattatat aatacagaat 
caggattatt ttttatatct 
atgaagatct cattatatga 
agaaaataca ctctaaaatc 
tggtctccta gtaagaagtc 
acctcaacct ctgctgagat 



tactttgtaa ccactttgat 
acacaaattt aaaaagaaag 
agtcaggaga tcgcttggtl 
gttcggtctg ctttgaaatg 
caggtataaa attttaatta 
ttttaataga aaacctcagg 
cttagaattg tccacctagt 
taacttagtg ctagctagcg 
gtgaaaccat ataatcacag 
ggcactcttg agggtgttgt 
ttcaaatagg ttatgctcca 
atgttggaag tggtatttag 
ttgaatagta taggggaggg 
atggacaaga gaaataattg 
tatgggacat tcgttctatt 
tttcctttat ttctacatct 
tgtcttacac tttaataaat 
gtagctgaat ttgttaaagt 
ggaaaatcat aggttaccat 
ttgatttgaa acatattaga 
accgacggta gcgtcaCatg 
tgtgtgctag gaacagcctt 
ccctccgttt ccccLcagtc aaacttgagc 
2640 

tgtttccatg gggtgtacct atactttaag 
2700 

ataaaaagcc aagaagaaaa aaaaaatttt 
2760 

gcatctcttc ccccacctgt cagctagcca 
2820 

cacctgaagc accatctgaa aggggcacca 
2851 



<210> 23 

<211> 3473 

<212> DNA 

<213> human 

<400> 23 

aagagcagcg gcgaggcggc ggtggtggct 
60 

gctctagggg ttggcaccgg ccccgagagg 
120 

ctgctgtgtg cggtgctgct gagcttggcc 
180 

gatgaatcct tagattccaa gactactttg 
240 

actgcaggca gagtagttgc tggtcaaata 
300 

tcctctattc aagaagagga agacagcctc 
360 

gatatcagct ttctagagtc tccaaatcca 
420 

gtacggaaac cagctttgac cgccattgaa 
480 

ccttttcttt tcctagataa ggagtatgat 
540 

agactgtggt gtgctacaac ctatgactac 
600 

actgaagaag aggctgctaa gagacggcag 
660 

ggaatgaaaa tccttaatgg aagcaataag 
720 

ctccaaaagg cagcaagcat gaaccatacc 
780 

ttatttggtg attacttgcc acagaatatc 
840 

actgaggaag gctctcccaa gggacagact 
900 



cogcctctgg atcgatgtga tcttattgca 
ccaatcctgc tgcattcact gctaagttaa 
gcactgtgca gatcctttgc tatctgactt 
cctgcttgtt tgtgttggga tattttttag 
t 

gagtccgtgg tggcagaggc gaaggcgaca 
aggatgcggg tccggatagg gctgacgctg 
tcggcgtcct cggatgaaga aggcagccag 
acaUcagatg agtcagtaaa ggaccacact 
tttcttgatt cagaagaatc tgaattagaa 
aagagccaag agggggagag tgtcacagaa 
gaaaacaagg actatgaaga gccaaagaaa 
ggcacagcac atggggagcc ctgccacttc 
gaatgtacat cagatgggag ggaagatggc 
aaagcagatg aaaagtgggg cttttgtgaa 
atgcaggaag cagaaatggt gtatcaaact 
aaaagccaaa aaagagaagc atatcggtat 
aaagccctgg agagagtgtc atatgctctt 
caggcagcga gagagatgtt tgagaagctg 
gctcttggct ttctgtatgc cLctggactt 
ggtgttaatt caagtcaggc aaaggctctt gtatattata catttggagc tcttgggggc 
960 

aatctaatag cccacatggt tttgggttac agatacLggg ctggcatcgg cgtcctccag 
1020 

agttgtgaat ctgccctgac tcactatcgt cttgttgcca atcatgttgc tagtgatatc 
1080 

tcgctaacag gaggctcagt agtacagaga atacggctgc ctgatgaagt ggaaaatcca 
1140 

ggaatgaaca gtggaatgct agaagaagat ttgattcaat attaccagtt cctagctgaa 
1200 

aaaggtgatg tacaagcaca ggttggtctt ggacaactgc acctgcacgg agggcgtgga 
1260 

gtagaacaga atcatcagag agcatttgac tacttcaatt tagcagcaaa tgctggcaat 
1320 

tcacatgcca tggccttttt gggaaagatg tattcggaag gaagtgacat tgtacctcag 

1380 

agtaatgaga cagctctcca ctactttaag aaagctgctg acatgggcaa cccagttgga 
1440 

cagagtgggc ttggaatqgc ctacctctaL gggagaggag ttcaagttaa ttatgatcta 
1500 

gcccttaagt atttccagaa agctgctgaa caaggctggg tggatgggca gctacagctt 
1560 

ggttccatgt actataatgg cattggagtc aagagagatt ataaacaggc cttgaagtat 
1620 

tttaatttag cttctcaggg aggccatatc ttggctLtct ataacctagc tcagatgcat 
1680 

gccagtggca ccggcgtgat gcgatcatgt cacactgcag tggagttgtt taagaatgta 
1740 

tgtgaacgag gccgttggtc tgaaaggctt atgactgcct ataacagcta taaagatggc 
1800 

gattacaatg ctgcagtgat ccagtacctc ctcctggctg aacagggcta tgaagtggca 
1860 

caaagcaatg cagcctttat tcttgatcag agagaagcaa gcattgtagg tgagaatgaa 
1920 

acttatccca gagctttgct acattggaac agggccgcct ctcaaggcta tactgtggct 
1980 

agaattaagc tcggagacta ccatttctat gggtttggca ccgatgtaga ttatgaaact 
2040 

gcatttattc attaccgtct ggcttctgag cagcaacaca gtgcacaagc tatgtttaat 

2100 

ctgggatata tgcatgagaa aggactgggc attaaacagg atattcacct tgcgaaacgt 
2160 

ttttatgaca tggcagctga agccagccca gatgcacaag ttccagtctt cctagccctc 
2220 
tgcaaattgg gcgtcgtcta 
2280 

ttcacccaac ttgatatgga 
2340 

atcattgcgc tgctgttggg 
2400 



cctgcaccca ggcctccagg 
2460 

cagcagccac cacagtaata 
2520 

tatctgctgg gaacacttgc 
2580 



gaggcacggc acaaggaagc 
2640 

ttcagggata agtaactctt 

2700 

ataacaactt tcagtggctt 
2760 

taatgtctac tgtatccagc 
2820 

taagttctta atgtcaacca 
2880 

gtaaaaagga aaacccagtt 
2940 

ccttttaagt taaaaaaaaa 
3000 

gaaactctta ccogtccaca 
3060 

gagggttggq aggtttctta 
3120 

aagaatgaaa ggccttgtta 
3180 

acttatgcaa aaccttgtga 
3240 

tgtaatttgc ttgtttgttt 
3300 



taagtgggag aaattagaaa 
3360 

agacatactt tCcctaaagt 
3420 

cttttgcaca aagaacacat 
3473 



<210> 24 



tttcttgcag tacatacggg 
ccagcttttg ggacctgagt 
aacagtcaCa gcttacaggc 
gccacggcca gctccacccc 
ggcactgggt ccagccttga 
atttgattta ggaccttgga 
attgaattcc Laaagctgct 
acctaaactg agctgaatgt 
ttttttttct tttctggaaa 
tatctttctt ggatcctttt 
tctttaaggt attgtgcatc 
gcaagtttaa acgtgttcga 
aaaagctatc ttgaaaatgt 
tgcaattaga catattcagc 
ttggtgattg tcacacggta 
aggagttttt tgtgagcttt 
actgactcct tgcactaacg 
tgaatataca gagccttgat 
acaaaacgaa ctctggttgg 
tgaagcattt gttcccagga 
caccatttcc ttttgcacaa 



aaacaaacat tcgagatatg 
gggaccttta cctcatgacc 
aaaggcagca ccoagacatg 
agcaggaggg gccaccagag 
tcagtgacag cgaaggaagt 
tcagtggtca cctcccagaa 
tagaatctga tgcctttatt 
ttgtttcagt gccatatgga 
catatgtgag acactcagag 
ggtcattatt tcagtgtgca 
gacactaaaa actgatcagt 
aagtctgaaa atagaacttg 
tttggaactg cgataactga 
atatttgtta ttttaaaagg 
taccatactc ctctccttca 
acttctttgg aatggaatat 
cgagtttgcc ccacctactc 
ccagaagcca gaggatggac 
ggtactacga tcacagacac 
tttattttac tttgcatttc 
agaacacatc acc 
<211> 401 

<212> DNA 

<213> human 

<220> 

<221> misc_feature 

<222> (252)..(252) 

<223> any kind of base 



<220> 

<221> misc_feature 
<222> (303)..(303) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (390)..(390) 

<223> any kind of base 



<400> 24 

ttagattatt ttcaatLUat tattcagaat 
60 

tagttattga attgtattgg tttaaattaa 
120 

caagagatac aaaaggaaat tgagtgaaaa 
180 

gtctctacct agaggcaatt attgtcaaca 
240 

attaagcctg gngtactaga tctcttttaa 
300 

aangttcaat tactagtaac accttattac 
360 

ttcatggacc attgatgtca tttggattcn 
401 



<210> 25 

<211> 1820 

<212> DNA 

<213> human 

<400> 25 

aatgtcttag aaaaaggctt tctaaaagaa 
60 

agatacaaag aacttcagga aaaacataaa 

120 

cacgaagccc tcagcattat tgtggatgaa 
180 

caacaagtag aagctattga aaaacagtac 
240 

tgtgaggagt tgctaaatgc tcagcatcag 
300 



aaatatatct tttttcttta acttctcaaa 
atgcgtcatg tgtatatatc agtattaatt 
ataagtctgc ctccttccca tcactctcat 
gtttttgatg tgtctttcaa aaaatagtcc 
aagtttacaa cctgttacag aatatatata 
agatacagat tacaacttag gaaatatatt 
cccctacaat c 

aaagagcaag aggccatttc ttttcaagat 
caagaattgg aagacatgag gaaagctggt 
tataaggcac tactgcagtc ttcagttaag 
atttctgcaa ttgagaaaca ggcacacaag 
aggctccttg aaatgctaga tacagagaaq 
gaactgttaa aagaaaaaat aaaggaagct ttgattcagc aatctcaaga acagaaggaa 

360 

ataUtggaaa aqtqtttgga ggaagaaagg caaagaaata aagaggcatt agtatccgct 
420 

gcaaagcttg aaaaagaagc agtgaaggat gcagttttaa aagtcgtaga agaagaaaga 
480 

aaaaatttag aaaaagcgca tgctgaagaa agggaattat ggaagacaga acatgcaaaa 
540 

gatcaagaaa aagtatctca ggaaattcaa aaagctatac aagaacaaag oaaaataagt 
600 

caggaaactg ttaaggcagc aataatagaa gagcagaaac gaagtgaaaa ggctgtggaa 
660 

gaggcagtga aaagaacaag agatgaattg atagagtata taaaagaaca gaaaaggctc 
720 

gatcaagtca tccgccaaag aagcctgtcc agtttggaac tgttcctctc ctgtgcacag 
780 

aaacagttaa gtgctttaaU agctacggaa ccagttgaca ttgaataaaa agaacatgac 
840 

aaacccacac tggcattgga taaatcatat tacaccttca aaatacacac tctgaattat 
900 

aaagatgtgt ttgttttctt tccaaatcat gtagaattga tttccagttc aaggataaac 
960 

caaaacaata tttagaacta ccaagtgatc taatttattt tcttttggtt tcttctttac 
1020 

atttactgtt attttattat tattagtagt agcagcaaca gagtatgata tgacccaaaa 

1080 

gccattgtaa agtgccacet taccaaaatt aattaagtaa actttatagc ctgtgggagt 

1140 

ctattatata ttattttgca aaagtagtaa atatattatt gtttcatgat gactcttgat 
1200 

gagatgctag aatgtaacca tacatttatc ttattttgag gatagaaata gcatggattt 
1260 

caacatcact tatttatctg tataattgga aataaaacac cgatatgata gagaatcatt 

1320 

ccggcattac ctaacctctt ctgcagttgg atctatgtat tttcattggt ctactgaaaa 
1380 

cgaacaatac aattaaaagc actaaagatt attatattaa ttcaactttg atctgatata 
1440 

tcacttaaac taaaggggtg tgtgtggtqt atgcttgttt cctatttctg ctctttaaag 
1500 



atactttgaa tcaataaaac cattagtcta caaatcaaat tgtgaactta atctctagaa 
1560 

agagaatata actcagccat ttataggaat ttaggttcaa gtacaggata tatqaaatct 
1620 

tttcccagta tttcagaatg tacttaattc acaggcagga tgrttcaatg caaaatcatg 
1680 

aatattttta attcaaaact aaaatgtcat taatatgtat gtatgcaaat gttttatctt 
1740 

attttctgaa atgcatctac tttcatgggc tttgtacgtt tctgagattt ctcagtgtaa 
1800 

taaaaagagc tcccaaactt 
1820 



<210> 26 

<211> 280 

<212> DNA 

<213> human 

<220> 

<221> misc_feature 

<222> (261)..(261) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (237)..(237) 

<223> any kind of base 



<400> 26 

tcaagtcata agataaagtt taatcatttg atcatgttaa aagacacaaa acacagccaa 
60 

tctaaccaaa tttcaggcat gcatttacat aaatatatta aattaagaaa agaaattgta 

120 

cacttaaacg tccttttcac ctagaaatca ttaaatccac agatcaacaa taaaaccaat 
180 

tctctgcatt taccacttca agatacaatt gttctatttt aaagataaca caaactncac 
240 

tagtctggtt aggaatttat ntgcattata catatattat 
280 



<210> 27 

<211> 392 

<212> DNA 

<213> human 

<400> 27 

ttggtttgaa atggcacccc aggactttgg gcctgcccta cttgatagcc tcgttcagtg 
60 

agcaaagact tagtgagcag ctcttgtatg ccaagtattt tgctaagctc tggaaaaaag 
120 

ataaacaaga catggttctt gctttcaagg agtgtgtaat tctttagcca gatatggaaa 
180 

cctggaccct gagtgggaga aaggagacag atgaaaggag tccgtgattt tgtaaccaag 
240 

agctgcctgc atggttatga gtatcactga ttttagggac gcccacagag ctaaagcatt 






300 

tttttaatcc gagaagactt ttgtaactca tattagttaa tcttccagct ctgagatagc 
360 

aacacagctc ttagaattct gtaagtaagc tt 
392 



<210> 28 

<211> 2299 

<212> DNA 

<213> human 

<400> 28 

cgaaccccca cagctggagg gcgaggccng ctgtacccgg ccccagtgcc ctttcgcggc 
60 

cacaagcggc cgtcctcctg gtccqgtgct ccggcgcctg atctaggttc atggagccgg 
120 

ggctgtggct ccttttcggg cCcacagtga cctccgccgc aggattcgtc ccttgctccc 
180 

agtctgggga tgctggcagg cgcggcgtgt cccaggcccc cactgcagcc agacctgagg 
240 

gggactgtga agagactgtg gctggccctg gcgaggagac tgtggctggc cctggcgagg 
300 

ggactgtggc cccgacagca ctgcagggtc caagccctgg aagccctggg caggagcagg 
360 

cggccgaggg ggcccctgag caccaccgat ccaggcgctg cacgtgcttc acctacaagg 
420 

acaaggagtg tgtctactat tgccacctgg acatcatttg gatcaacact cccgaacaga 
480 

cggtgcccta tggactgtcc aactacagag gaagcttccg gggcaagagg tctgcggggc 
540 

cacttccagg gaatctgcag ctctcacaLc ggccacactt gcgctgcgct tgtgtgggga 
600 

gatatgacaa ggcctgcctg cacttttgca cccaaactct ggacgtcagc agacaggttg 
660 

aagtcaagga ccaacaaagc aagcaggctt tagacctcca ccatccaaag ctcatgcccg 
720 

gcagtggact cgccctcgct ccatctacct gcccccgctg cctctttcag gaaggagccc 
780 

cttaggagga caggcctgca gcatcctggt ctcgggaggc ttctgtcatt gctcacacac 
840 

agttcagatt tccacctctt tatagacaag aagtgaattt gcctggggca gaacacccac 
900 

ccaaagagtc cccacttaac aatacccccc ccccacggca agaatgccca aatccgaatg 
960 

accccagttt tcctaatgag taaaatgatc ccagatgtgc cccagagcat gacgcctgca 
1020 
gctccggttt cotgcaggaa 
1080 

ctgqcttttq acatgacttc 
1140 

tgtaaacaca tgtccatctt 
1200 

tttcatatcc ttgcctagaa 
1260 

ggctttagac acagatcata 
1320 

ggcagagaaa ttttcagctg 
1380 

aggaagaaga aaaaaggatc 
1440 

caatacagag cttgttcctg 
1500 

cqatacacag tggagttccc 
1560 

gctcaaggct attaggttga 
1620 

gcttcaaaat aagtcacgaa 
1680 

aattggcaac aacttatacc 
1740 

ccgagccgag cttactgtga 
1800 

ggggagggct gcccatctcc 
1860 



ggtgtccagg gccccgtaga 
1920 

ccaaatgtta aatcctctgt 
1980 

tatcatgagg aaatgaaagt 
2040 

gattatttat tgtgaaactg 
2100 

gtcactgtat atacgtatag 
2160 



tacaaactca tactccttag 
2220 

caccgtggca agatggtatc 
2280 

tgtcaaaagc ctaataaaa 
2299 
attggttttg gagagttttg 
tcttggagaa taagtggact 
gtaataaatg caaaatgccc 
taggctgcat ggtqtatgtc 
gctctacagg agtttatgaa 
tgcttgatac ccaccaaaag 
cttgatgttt gtgacaagaa 
tccagtgact gaccctctgt 
aggccttgtt tgcaggaagc 
atatttgctt tcatgagtaa 
cacaaattct ttgtaaatta 
gtctgacagt tcaaaatctc 
gtgtggagat gttatcccac 
ccaacccagt cacagagaga 
gagacattta agatggtgta 
gtgtatttca taagttatta 
ggctgatttg ctggtaggat 
ttctccactc caactccttt 
agaggtagat aggtaggtag 
agcttgaatt acatttttaa 
agagagaaac ccatcaattg 



gcaagttgga aagccactta 
ccaagctaac tctttgcaaa 
gtgcaqcaga agcatgcgac 
agtgagggcc acgaggcgtc 
tttgaagctt atgggatttt 
aatgtatctc gaaagaatga 
aatgagaaag ttagtatctg 
attctgtata gacaccaggc 
cgactgtaaa gacagcccca 
atgtggatct ttggggaatg 
tgtaaattcc tgtttaCata 
tttcagctgc gctcttccca 
catgtaaagt cgcctgcgca 
taggaaacgg catttgagtg 
tgacagagca ttggccttga 
caggtataaa agtgatgacc 
tttgtacagt ttagagaagc 
atgtggatct gttcaaagta 
attttaaatt gcattctgaa 
aatgcatatg tgctgtttgg 
ctcaaatact cagaaagtac 
<210> 29 

<211> 1339 

<212> DNA 

<213> human 

<400> 29 

ctaaacaaaa tcattcactt ccctgatttt gataagaaaa ttcctgtaaa gctgtttcct 
60 

ctgcctctcc tctacgttgg aaaccacata agtggattat caagcacaag taaattaagc 

120 

ctaccgatgt tcaccgtgct caggaaattc accattccac ttaccttact tctggaaacc 
180 

atcatacttg ggaagcagta ttcactcaac atcatcctca gtgtctttgc cattattctc 

240 

ggggctttca tagcagctgg gtctgacctt gcttttaact tagaaggcta tatttttgta 
300 

ttcctgaatg atatcttcac atcagcaaat ggagtttata ccaaacagaa aatggaccca 

360 

aaggagctag ggaaatacgg agtacttttc tacaatgcct gcctcatgat tatcccaact 
420 

cttattatta gtgtctccac tggagacctc caacaggcta ctgaattcaa ccaatggaag 

480 

aatgttgtgt ttatcctaca gtttcttctt tcctgttttt tggggtttct gctgatgtac 
540 

tccacggttc tgtgcagcta ttacaattca gccctgacga cagcagtqgt tggagccatc 
600 

aagaatgtat ccgttgccta cattgggata ttaatcggtg gagactacat tttctctttg 
660 






ttaaactttg tagggttaaa tatttgcatg gcagggggct tgagatattc ctttLLaaca 
720 

ctgagcagcc agttaaaacc taaacctgtg ggtgaagaaa acatctgttt ggatttgaag 
780 

agctaaagag tctgcagcag gattggagac tgacttgtga ctgcgggctg ggggggcatt 
840 

cccagtagga atgtgaagcc agaggtttcg gattcgtgac atccaccccc tgggcaagtg 
900 

agagcatctg caaaatgcaa agagaactac ctcatatgca ggatgagcca atggcagtcU 
960 

caagaaatgt actcgggcga caccttacct gtggaaagca aatcttttca aaataagcca 
1020 

ctgggactcg gtaggtggag. ccccagctgc tcttctaggg acctatgggg ccttcgtggc 
1080 

atctctgtgc tgtgtgctgg ggaggaggtt gatgtaatgg tgactctttt ctgaccagca 
1140 

ccttggccgt gattcccaag gtcccagcca aagcaaaggg ccagttgttt cagtttaaac 
1200 
agacatgtct ttagtctaat aaaattagtt aactgccagt aaagttattt gttagcttLg 
1260 

atgaaagcta tgttggtatc tttccctaat catcaaagta aataaaaaat catttctatg 
1320 

taaaaaaaaa aaaaaaaaa 
1339 



<210> 30 

<211> 4250 

<212> DNA 

<213> human 

<400> 30 

gaacacatcg cgtttgcatc ccagaaagta gtcgccgcga cUatttcccc caaagagaca 
60 

agcacacatg taggaatgac aaaggcttgc gaaggagaga gcgcagcccg cggcccggag 
120 

agatcccctc gataatggat tactaaatgg gatacacgct gtaccagttc gctccgagcc 
180 

ccggccgcct gtccgtcgat gcaccgaaaa gggtgaaqta gagaaataaa gtctccccgc 
240 

tgaactacta tgaggtcaga agccttgctg ctatatttca cactgctaca ctttgctggg 
300 

gctggtttcc cagaagattc tgagccaatc agtatttcgc atggcaacta tacaaaacag 

360 

tatccggtgt ttgtgggcca caagccagga cggaacacca cacagaggca caggctggac 
420 

atccagatga ttatgatcat gaacggaacc ctctacattq ctgctaggga ccatatttat 

480 

actgttgata tagacacatc acacacggaa gaaatttatt gtagcaaaaa actgacatgg 
540 

aaatctagac aggccgatgt agacacatgc agaatgaagg gaaaacataa ggatgagtgc 
600 

cacaacttta ttaaagttct tctaaagaaa aacgatgatg cattgtttgt ctgtggaact 
660 

aatgccttca acccttcctg cagaaactat aagatggata cattggaacc attcggggat 

720 

gaattcagcg gaatggccag atgcccatat gatgccaaac atgccaacgL tgcactgttt 
780 

gcagatggaa aactatactc agccacagtg actgacttcc ttgccattga cgcagtcatt 
840 

taccggagtc ttggagaaag ccctaccctg cggaccgtca agcacgattc aaaatggttg 
900 

aaagaaccat actttgttca agccgtggat tacggagatt atatctactt cttcttcagg 

960 

gaaatagcag tggagtataa caccatggga aaggtagttt tcccaagagt ggctcaggtt 
1020 

tgtaagaatg atatgggagg atctcaaaga 
1080 

aaggcgcgct tgaoctgctc agttcctggs 
1140 

gcagttacag atgtgattcg tatcaacggg 
1200 

ccttataaca gcatccctgg gtctgcagtc 
1260 

gtttttactg ggagattcaa ggaacagaag 

1320 

gatgaacgag ttcctaagcc caqqccaggt 
1380 

tatgcaacct ccaatgagtt ccctgatgat 
1440 

atggatgagg cagtgccctc catcttcaac 
1500 

taccgcctta ccaaaattgc agtggacaca 

1560 

gtttttctgg gatcagagaa gggaatcatc 
1620 

ggttttctaa atgacagcct tttcctggag 
1680 

agctatgatg gagtcgaaga caaaaggatc 
1740 

tctctgtatg ttgcgttctc tacctgtgtg 

1800 

catgggaagt gtaaaaaaac ctgtattgcc 
1860 

gaaggtggtg cctgcagcca tttatcaccc 
1920 

gagcgtggca atacagatgg tctgggggac 
1980 

atttcaactc ctctaccaga taatgaaatg 
2040 

tccctcttgc ccagcacaac cacatcagat 
2100 

ggaggaatgc tggaclggaa gcatctgctt 
2160 

gcagtgtctt cccataatca ccaagacaag 
2220 

ggccacgacc agctggttcc cgtcaccctc 
2280 

atgggggccg tcttctcggg catcaccgtc 



gtcctggaga aacagtggac gtcgttcctg 
gactctcatt tttattr.tc.aa catt.r.tr.cag 
cgtgatgttg tcctggcaac gttttctaca 
tgtgcctatg acatgcttga cattgccagt 
tctcctgatt ccacctggac accagttcct 
Lgctgtgctg gctcatcctc cttagaaaga 
accctgaact tcatcaagac gcacccgctc 
aggccatggt tcctgagaac aatggtcaga 
gctgctgggc catatcagaa tcacactgtg 
ttgaagtttt tggccagaat aggaaatagt 
gagatgagtg tttacaactc tgaaaaatgc 
atgggcatgc agctggacag agcaagcagc 
ataaaggttc cccLtggccg gtgtgaacga 
tccagagacc catattgtgg atggataaag 
aacagcagac tgacttttga gcaggacata 
tgtcacaatt cctttgtggc actgaatgac 
tcttacaaca cagtgtatgg gcattccagt 
tcgacggctc aagaggggta tgagtctagg 
gactcacctg acagcacaga ccctttgggg 
aagggagtga ttcgggaaag ttacctcaaa 
ttggccattg cagtcatcct ggctttcgtc 
tactgcgtct gtgatcatcg gcgcaaagac 
2340 



gtggctgtgg tgcagcgcaa ggagaaggag 

2400 



agcgtcacca agcccagcgg ccucttcggg 
2460 



gccatccLca cgccactcaL gcacaacggc 
2520 



atgctcatta aagcagacca gcaccacctg 
2580 



accccaacgc tgcagcagaa gcggaagccc 
2640 



cagaacctca tcaatgcctg cacaaaqqac 
2700 



acggacctgc ccctgcgggc ctcccccagc 
2760 



acgcagcagg gctaccagca tgagtacgtg 
2820 



atggcgctgg aggaccaggc cgccacactg 
2880 



agcaagagtc ccaaccatgg ggtgaacctt 
2940 

gttccacagc gggaggcctc cctgggtccc 
3000 



agcaagcggc tggaaatgca ccactcctct 
3060 



cccacgaact cgctcacgag aagccaccag 

3120 



tcctccaatt cctctcacct ctccagaaac 
3180 



cccgccccgc agagggtgga ctccatccag 
3240 



gtgactgtct cgaggcagcc cagcctcaac 
3300 



aagcgtacgc cctcgccaaa gccggacgta 
3360 



acatccatga agcccaatga tgcgtgtaca 
3420 



accagcaggc aaggcgaggt gcccgctcag 
3480 



ccaccagacc aagaaggcct gcggcagagc 
3540 



caggggtact cacgaaaact gggccgcgtg 
3600 



caccttcatt ctcttccttc actttccccc 



ctcacccact cgcgccgggg ctccatgagc 
gacactcaat ccaaagaccc aaagccggag 
aagctcgcca ctcccggcaa cacggccaag 
gacctgacgg ccctccccac cccagagtca 
agccgcggca gccgcgagtg ggagaggaac 
atgcccccca tgggctcccc tgtgattccc 
cacatcccca gcgtggtggt cctgcccatc 
gaccagccca aaatgagcga ggtggcccag 
gagtataaga ccatcaagga acatctcagc 
gtggagaacc tggacagcct gccccccaaa 
ccgggagcct ccctgtctca gaccggtcta 
tcctacgggg ttgactataa gaggagctac 
gccaccactc tcaaaagaaa caacactaac 
cagagctttg gcaggggaga caacccgccg 
gtgcacagct cccagccatc tggccaggcc 
gcctacaact cactgacaag gtcggggctg 
ccccccaaac catcctttgc tcccctttcc 
taatcccagg gggagggggt caggtgtcga 
ctcagcaagg ttctcaactg cctcgagtac 
cgaggacgct gggtcctcct ctctgggaca 
gtttggtgaa ggtttgcaac ggcggggact 
acaccctaca acaggtcgga cccacaaaag 
3660 

acctcagtta tcatcacaaa 
3720 

acacacacac acacatgcac 
3780 

gaaacatttt gtccacaact 
3840 

aaaacacaaa tacatttttt 
3900 

attgataatt ctaactcaga 
3960 

aaatgcccgc cagtgttaca 
4020 

tgtcatagat ttctqctcct 
4080 

actctttatt tatugttcac 
4140 

ctatgaacag ctcttcagaa 
4200 

actggaataa ttgagtttct 
4250 



<210> 31 

<211> 2785 

<212> DNA 

<213> human 

<400> 31 

ctttagccca acagtcaaaa 
60 

atcatatttc taagttacag 
120 

tgttttttga cagtaaattt 
180 

acattattag agcttcttgL 
240 

cagaaagcct ggatcagaaa 
300 

ctctgaaatc cagtcaaaat 
360 

ccaacgccaa acaaatggga 
420 

tacttggttt cttccttgtt 
480 

ttaggcccta ttctgtagaa 
540 



catgagccaa aagcocatac 
ocoacacata cacacacacg 
tcacgggacg tggccagact 
aaaatcaaga aaatttaaaa 
ctttaacaat ggcagaagtt 
gctttccgtt gcagcagaLa 
cctctctttt aatgaaataa 
cctttttttc cttaaggaaa 
agcccattga aagttaaact 
ttatttttac aataaattca 

ataattgatg ctaccctaca 
caaatattag tcctgctaaa 
gtccttgatt atatattaac 
tgtaggtggg ttaacaccac 
accatcaccc taaaaaaaca 
atgactaaag gcccttgcca 
gcctggttac gagtcagcct 
attgtcataa taaaatgttt 
gtctcctcta ctattcaggc 



ctaccccatc ccccaccccc 
cacagaggtg aacagaaart 
gggtttgcgt tccaacctgc 
agacaaaaaa aaaagaattc 
tactatgcgc aaaUactgtg 
aatgccatgt tgggcaacta 
cgtgaccgtt aacgcaagta 
ggactcttcc aaatatcatc 
atttaacgLg aaatccatta 
ctgagtaaat 

aatgtccaaa actctagtat 
ccagggagct ttggcaaaaa 
tagtcaaaga ggtgtttgta 
caatcaagag gtcattctaa 
tgccttacat atttaacaca 
tgactgatgt attctcctgg 
tcagggactt gtcacatttc 
tctatgctgt ttagtgcaac 
cactcaaaca ccccaaataa 
r.tgagttcaa aahcgacatc 
600 



ggtattttta ttgattattg 
660 



gtcagtttcc tctcttgagc 
720 

ttctcacact ctggggccac 
780 



aattaaggac agcgcccatg 
840 



ttcaacctag ctaaccccac 
900 



tgtggtaccc agtcctcagg 
960 



ttcagagctg ttttccacag 
1020 



gagaaaaaga aaactcagta 
1080 

aatggtcctg aacaatggag 
1140 



gttgatatta aaaccagtga 
1200 



tacaacccag ccacagccaa 
1260 



gaggacaacg ataaccgatc 
1320 



tttcagtttc attttcactg 
1380 



ggagtcaaat attctgccga 
1440 



cttgctgaag ctgcctcaaa 
1500 



ggtgaggcca acccaaagct 
1560 



ggcaaacgag ccccattcac 
1620 



ttctggacct accctggctc 
1680 



atctgtaagg agagcatcag 
1740 



tcaaatgttg aaggtgataa 
1800 



cLgaagggca gaacagtgag 
1860 
aagatataaa ggaatcagtg 
tgctgtcttg acctagtatg 
agctgattaa atccacaccc 
tatgtaccca ctctaatcac 
ccccaaagcc cgccaaaatt 
cctttttgct gtacataagc 
Lgcaaccccc Lgcgtggtcc 
aggtagtgaa aagaactgga 
gaagataatg gcaagtccag 
caagctgtat cccattgcca 
aaccaaacat gacacctctc 
agaaattatc aatgtggggc 
agtgctgaaa ggtggtcctt 
gggcagtaca aatgagcatg 
gcttcacgta gctcactgga 
ggctgatggt ttggcagtta 
gcagaaagta cttgatgccc 
aaattttgac ccctctactc 
tctgacCcat cctcctcttt 
tgtcagctca gagcagctgg 
cgctgtcccc atgcagcaca 
agcttcattt tgatgattct 



actaaatata tttcatalat 
gaggccttgg ctagaggctg 
caaccacttc ccttatcagg 
cacagggcca gacatcagac 
atgcaaatta ttcaaaatta 
tgcccattcc ccctccagcc 
tctgtggcag ccttctctca 
ttttcaagtt cactttgcaa 
actggggata tgatgacaaa 
atggaaataa ccaatcccct 
tgaaacctat tagtgtctcc 
attctttcca tgtaaatttt 
tctctgacag ctacaggctc 
gttcagaaca tacagtggat 
attctgcaaa gtactccagc 
ttggtgtttt gatgaaggtt 
tccaagcaat taaaaccaag 
tccttccttc atccctggat 
atgagagtgC aacttggatc 
cacaattccg cagccttcta 
acaaccgccc aacccaacct 
gagaagaaac ttgtccttcc 
tcaagaacac agccctgctt ctgacataat ccagttaaaa ta?»t:^;=>tttt taagaaataa 
1920 

atttatttca atattagcaa gacagcatgc cttcaaatca atctgtaaaa ctaagaaact 
1980 

taaattttag ttcttactgc ttaattcaaa taataattag taagctagca aatagtaatc 

2040 

tgtaagcana agcttatctt aaattcaagt ttagtttgag gaattcttta aaattacaac 
2100 

taagtgattt gtatgtctat ttttttcagt ttatttgaac caataaaata attttatctc 
2160 

tttctttctg ttgtgcattc agtttctaaa accattaagt ttctactcca tttacattca 
2220 

aaaatcttaa atactttact tgcaagagta ttttgcttca aatacaacaa cctaagagca 
2280 

gctggagatg aaatattggg aaattcattt gcttactcct gaagacaaaa atatagctga 
2340 

gatgaccact ggatttaata tcgttatgct ggcccaacat tgctaccatt tgtgttgtct 
2400 

gtgatcaaaa tgattatctt ttatatagga agatgacgct tctggatatt gctttcactt 
2460 

cttctcccca cgttagcaag gacaatgctt ctctgccatt attacaacta gttagtttgc 

2520 

atggagaatc tttactttaa aattggaaqa aaagtcacaa gtgaatggtt tataaaaatg 
2580 

ctaaagaagt cattcttgct tagaatcata tagaaacatc atgcaatctt ttagtcagat 

2640 

gtgcgcttca ccutatgcta tttttatctt taattgacac acaataattg tacatgttta 
2700 

tggagtatag tgtggtgttt tctgtttgtt tgtttgtttt ttgagacaag gtctcactct 
2760 

gccagtcagg gtggagtgcg atggt 
2785 



<210> 32 

<211> 9588 

<212> DNA 

<213> human 

<400> 32 

ccgaccaaca ccaacaccca gctccgacgc agctcctctg cgcccttgcc gccctccgag 
60 

ccacagcttt cctcccgctc ctgcccccgg cccgtcgccg tctccgcgct cgcagcggcc 

120 

tcgggagggc ccaggtagcg agcagcgacc tcgcgagcct tccgcactcc cgcccggttc 
180 

cccggccgtc cgcctatcct tggccccctc cgctttctcc gcgccggccc gcctcgctta 
240 
tgcctcggcg ctgagccgct 
300 



cacccgcgga tcaacactct 
360 



tacgaggtga ccagcggcgg 
420 



atcaccgacc agaactcgga 
480 



cagaacacca tccaggagct 
540 

atcgtgcagc ctgaattgaa 
600 



gatgagtgtt ttgcccaggc 
660 



atgcggcaga tgggccagcc 
720 



caaatgcgag ccctttataa 
780 



ggtggtggag gctacacttg 
840 



accagtgaat gtttggggtg 
900 



qgtgtggacc tggcctcagt 
960 



atcggcgact atcgctggca 
1020 



atctaccagt tggaggagga 
1080 



cacctgcgac agctgcagaa 
1140 



gactgcgagg aggaggagct 
1200 



aaacaggagg ccttctccat 
1260 



aagctgaaac aagaaagtga 
1320 



gaggcctata tggacactct 
1380 



attqatgttc atctgaaaga 
1440 



actgaagcat acctgaaggg 
1500 



aacatgcccc tgcagcacct 
1560 



ctcccgattg cccgccgaca 
gggccgcatg atccgcgccg 
cgggggcacc agcaggatgt 
cggctactgt caaaccggca 
gctgcagaac tgctccgact 
gtatggagat ggaatacaac 
caatgaccaa atggaaatcc 
ctgtgatgct taccagaaaa 
agccatcagt gtccctcgag 
tcagagtggc tctggctggg 
gatgaggcag caaagggcgg 
ggagcagcac attaacagcc 
gctggacaaa atcaaagccg 
gtatgaaaac ctgctgaaag 
catcattcag gccacgtcca 
gctgtacgac tggagcgaca 
acgcaLgagt caactggaag 
ccaacttgtc ctcaatcagc 
gcagacgcag tggagttgga 
aaatgctgcc tactttcagt 
gctccaggac tccatcagga 
gctggaacag atcaaggagc 



tgagctgcaa cggaggctcc 
agtctggccc ggacctgcgc 
actattctcg gcgcggcgtg 
cgatgtccag gcaccagaac 
gcttgacgcg agcagagctc 
tgactcggag tcgagaattg 
tcgacagctt gatcagagag 
ggcttcttca gctccaagag 
tccgcagggc cagctccaag 
atgagttcac caaacatgtc 
agatggacat ggtggcctgg 
accgqggcat ccacaactcc 
acctgcgcga gaaatctgcg 
cgtcctttga gaggatggat 
gggagatcat gtggatcaat 
agaacaccaa catcgctcag 
ttaaagaaaa agagctcaaL 
atccagcttc agacaaaatt 
ttcttcagat caccaagtgc 
tttttgaaga ggcgcagtct 
agaagtaccc ctgcgacaag 
tggagaaaga acgagagaaa 






atccttgaat acaagcgLca ggUgcayaac 
1620 



ctgaagcctc gtaacccaga ctacagaagc 
1680 



gactacaaac aagatcagaa aatcgtgcat 
1740 



aacgagcgca gcaagtggta cgtgacgggc 
1800 



gtggggctga tcatccctcc tccgaaccca 
1860 



cagtactacg aagccatctt ggctctgtgg 
1920 



gtgtcctqgc actactgcat gattgacata 
1980 



ctgaaaacaa tgcggcagga agattacatg 
2040 



caagagttca tcagaaatag ccaaggctca 
2100 



atacagtctc agttcaccga tgcccagaag 

2160 



ggctatcccc agcaccagac agtgaccaca 
2220 



gatgtcaacc ataataaagt aattgaaacc 

2280 



atgctgatgg agctgcagaa gattcgcagg 
2340 



ctcaaaaacc tccctctagc agaccagggg 
2400 



gagcttaaga gtgtgcagaa Igattcacaa 
2460 



gatatgcttg ccaacttcag aggttctgaa 
2520 



ggactatttc agaaactgga aaatatcaat 
2580 



tgcacagtaa gggcactgct ccaggctott 
2640 



gaagccaggc tcactgagga ggaaactgtc 
2700 



cgctgtggac tgaagaaaak aaaaaatgac 
2760 



atgaagacag aactacagaa agcccagcag 
2820 



ctttatgatc tggacttggg caagttcggt 
2880 



ttggtaaaca agtctaagaa gattgtacag 
aataaaccca ttattctcag agchctctgt 
aagggggatg agtgtatcct gaaggacaac 
ccgggaggcg ttgacatgct tgttccctct 
ctggccgtgg acctctcttg caagattgag 
aaccagctct acatcaacat gaagagcctg 
gagaagatca gggccatgac aatcgccaag 
aagacgatag ccgaccttga gttacattac 
gagatgtttg gagatgatga caagcggaaa 
cattaccaga ccctggtcat tcagctccct 
actgaaatca ctcatcatgg aacctgccaa 
aacagagaaa atgacaagca agaaacatgg 
cagatagagc actgcgaggg caggatgact 
tcttctcacc acatcacagt gaaaattaac 
gcaattgctg aggttctcaa ccagcttaaa 
aagtactgct atttacagaa tgaagtattt 
ggtgttacag atggctactt aaatagctta 
ctccaaacag aagacatgtt aaaggtttat 
tgcctggacc tggataaagt ggaagcctac 
ttgaacttga agaagtcgtt gttggccact 
atccactctc agacttcaca gcagtatcca 
gaaaaagtca cacagctgac agaccgctgg 
caaaggatag ataaacagat cgactttaga 
2940 

ttgaggaatt atcgtgataa ctatcaggct 
3000 

cgccaggatt ccttagaatc catgaaattt 
3060 

aatgagcaga agaacttgca cagtgaaata 
3120 

caaaaaattg ctgaactttg cgccaattca 
3180 

tacacctcag gactggaaac tctgctgaac 

3240 

ccttctgggg tgattctgca agaggctgca 
3300 

acaagatctg gagactatta coggttctta 

3360 

aagctgaaaa ataccaagat cgaagttttg 
3420 

aactcggaaa actgtaataa gaacaaattc 
3480 

gagtgttccc agttcaaagc gaagcttgcg 
3540 

ctggatggga agtcggctaa gcaaaatcta 
3600 

aatgagaaga tcacccgact gacttatgag 
3660 

gtggaagaca gatttgacca acagaagaat 
3720 

tgtgaaaagg agaaccttgg ttggcagaaa 
3780 

gagtacgaga ttgaaaggtt gagggttcta 
3840 

tatgaaaatg agctggcaaa ggtaagaaac 
3900 

aacaagtatg aaacagagat taacattacg 
3960 

aaagaggatg attccaaaaa tcttagaaac 
4020 

gatctgaagg atgaaattgt caggctcaat 
4080 

aggcgagctg aagaaaacgc ccttcagcaa 
4140 

aagcagcatc tggagataga actgaagcag 
4200 



ttatgggacc Uggagaaaca aatcaagcaa 
Ltctgcaagt ggctctatga tcgtaaacgc 
ggagattcca acacagtcat gcggtttttg 
tctggcaaac gagacaaatc agaggaagta 
attaaggatt atgagctrca gctggcctca 
atacctatca agaggaccat gattcagtcc 
gatgttcatg ctcggtacat tgaactactt 
agtgagatgc tgaagagttt ggaagatctg 
gaagaggagc tcagactggc ccgagatgcc 
ctggatcaga acctgcagaa ataccaggca 
agcctggagg agctgaagag acaggctgag 
gacaagtgct acggccaaat aaaagaactc 
attgaagatg aaaagagaag aagaaaatct 
gactatgacc aactgcagaa agcaaggcaa 
ttagagtctg agaaagccat caaggagaag 
ctgcaggaag aaggcacccg gaagagagaa 
cactataatg aggagatgag taatttaagg 
aagaccacca tcaaggagat atccatgcaa 
cagcttgata gactttcaag ggaaaatcga 
gacagcatct tgcaggccac tgagcagcga 
aaggcctgtg gctctgagat aatgcagaag 
gtcatgcagc agcgctctga ggacaatgcc 
cggcacaagc agtccctgga 
4260 



gagagactca aagctgagtt 
4320 



ctgagtaagg taagaaacaa 
4380 



accgagacca acatcaccaa 
4440 



accagtggct accgggctca 
4500 



gaaataaaga ggctgaagaa 
4560 



gaagacatcc aacagcaaaa 
4620 



gaggttgagc tgagacaagt 
4680 



tctcttgatg atgctgccaa 
4740 

caactgatcg aaaaagaaac 
4800 

caaagggtcc agtatgacct 
4860 



ctgaaggttc aggagcaaga 
4920 



gagaggactg tgaaggacca 
4980 



ctgcagaagc agaaggtgga 
5040 



tcctgcaaga ggaagaagct 
5100 



caagccatca aaatcaccaa 
5160 



aggagtgagg atgacctccg 
5220 



cagaggaccc aggaagagct 
5280 



ttactccagg aacaggaaag 
5340 



gcgatagaag ataaaagcag 
5400 



tctctcacag agaacctgac 
5460 



aggctggagt acgatgacct 
5520 



ggaggctgcc aagaccatcc 
tcaggaggag gccaagcgcc 
ttatgatgag gagatcatta 
gaccaccatc caccagctca 
gatagacaat ctcacccgag 
cactctaacc cagaccacag 
ggccactggc tctgaggtgt 
cactcagatg cgaacagagg 
aaccatccag gataaaaaca 
aaatgaccgg aaatgcctgg 
gcagaaagca aacagtagtg 
actgacacgc ctgaggatcg 
ggatatcacg cggttccaga 
agaggagctg aatcggctga 
ggaggaagag ctggaaggca 
cctgacccag cagctggagc 
gcagcagagg gacgtgctgg 
gaggaggctc tcttctgagg 
tgtcaaacaa gctcacttga 
aagcttaaat gaaagcaaaa 
caaggagcac ttgatgttag 
gaggagagga cgaagcgaag 
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aqgacaaaaa taaggagatc 
gctgggaata tgaaaatgaa 
gcttaaaaaa tcagtttgag 
ccatgcagaa ggaagaggat 
aaaacaggag cttatcCgaa 
agaatctcag gagggtggaa 
ctcagaggaa acagcagctg 
agagcgtaag atataagcaa 
aggagataga aaggttaaaa 
aagatgaaaa cgcgagatta 
cgacggagac aataaacaaa 
actatgaaag ggtttcccag 
aclctctgaa agagctgcag 
agaggaccgc gtcagaagac 
tgaggaggtc gctgaaggag 
aggcatccat tgttaagaag 
atggccacct gagggaaaag 
tcgaggccct gaggcggcag 
ggaatgagca tttccagaag 
tagaaattga gaggctgcag 
aagaagaact gcggaacctg 
cggacagtga taaaaatgca 
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accatcttgg aactaaggag ccagctgcag 
5580 

gggctgatta atgatttaca gagagagagg 

5640 

caaaagcagg ctttagaggc atctaatagg 
5700 

gtggtacagg aaagagagag ccttctggtg 

5760 

aggcLgcaga ggctggagga tgagctgaat 
5820 

agggtgaaac agcgcctgga gtgtgagaaa 
5880 

aagactcaat attcccgcaa ggaggaggct 
5940 

agtgagagag agaagaacag tcttaggagt 
6000 

agaattgaag agaggtgcag gcgtaagctg 
6060 

ttagaaacag aacgctcccg atatcagagg 
6120 

gggtcccatc gagagaccca gactgagtgt 
6180 

tttgatgggc tgaggaagaa ggtgacagca 
6240 

aaaacaacct tggacaaact attgaagggg 
6300 

atccagccat tccttcgggg tgcaggatct 
6360 

aaatactctt tggtagaggc caagagaaag 
6420 

cttctggagg cccaggcagc tacaggtggt 
6480 

actgtcgaca gtgccatagc tcgggacctc 
6540 

gcagcagaaa aagctatcac tggttttgat 
6600 

tcagaagcca tcaagaaaaa tttgattgat 
6660 

cagattgctt cagggggtgt agtagaccct 
6720 

gccttggccc gggggctgat tgatagagat 
6780 

agtcagaaaa actttgtgga tccagtcacc 
6840 



atcagcaaca accggacccc ggaactgcag 
gaaaattcga gacaggaaat tgagaaattc 
attcaggaat caaagaatca gtgtactcag 
aaaatcaaag tcctggagca agacaaggca 
cgtgcaaaat caactctaga ggcagaaacc 
cagcaaatcc agaatgacct gaatcagtgg 
attaggaaga tagaatcgga aagagaaaag 
gagatcgaaa gactccaagc agagatcaag 
gaggattcta ccagggagac acagtcacag 
gagattgata aactcagaca gcgcccatat 
gagtggaccg ttgacacctc caagctggtg 
atgcagctct atgagtgtca gctgatcgac 
aagaagtcag tggaagaagt tgcttctgaa 
atcgctggag catctgcttc tcctaaggaa 
aaattaatca gcccagaatc cacagccatg 
ataattgatc cccatcggaa tgagaagctg 
attgacttcg atgaccgtca gcagatatat 
gatccatttt caggcaagac agtatctgtt 
agagaaaccg gaatgcgcct gctggaagcc 
gtgaacagtg tctttttgcc aaaagatgtc 
ttgtatcgat ccctgaatga tccccgagat 
aaaaagaagg tcagttacgt gcagctgaag 
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gaacggtgca gaatcgaacc acatactggt 
6900 

atgtccttcc aaggaatcag acaacctgtg 

6960 



ttgagaccgt ccactgtcaa tgaactggaa 
7020 



gagagaatta aggacttcct ccagggttca 
7080 



acaaaacaga agcttggcat ttatgaggcc 
7140 



gctctggagL tgctggaagc ccaagcagct 

7200 



ttgaggttac cagtggagga agcctacaag 
7260 



aagctcctgt ctgcagaacg agctgtcact 

7320 



atctctttgt tccaagccat gaataaggaa 
7380 



ttagaagcac agatcgcaac cggggggatc 
7440 



gttgacatag catataagag gggctatttc 
7500 



ccaagtgatg ataccaaagg attttttgac 

7560 



caactaaaag aaagatgcat taaggatgag 
7620 



gaaaagaaga aacaggtgca gacatcacaa 
7680 



atagttgacc cagaaaccaa taaagaaatg 
7740 



attgattatg aaaccttcaa agaactgtgt 
7800 



atcacgggat cagatggctc caccagggtg 
7860 



tatgatattc aagatgctat tgacaagggc 

7920 



cgatccggca gcctcagcct cactcaattt 
7980 



ggcaccagca gcagcatggg cagtggtgtc 
8040 



gaatcagtaa gtaagatttc caccatatcc 
8100 



tctttttcag acaccctgga agaatcgagc 
8160 



ctgctcttgc tttcagtaca gaagagaagc 
accgccactg agctagtaga ttctggtata 
tctggtcaga tttcttatga cgaggttggt 
agctgcatag caggcatata caacgagacc 
atgaaaattg gcttagtccg acctggtact 
actggcttta tagtggatcc tgttagcaac 
agaggtctgg tgggcattga gttcaaagag 
gggtataatg atcctgaaac aggaaacatc 
ctcatcgaaa agggccacgg tattcgctta 
attgacccaa aggagagcca tcgtttacca 
aatgaggaac tcagtgagat tctctcagat 
cccaacactg aagaaaatct tacctatctg 
gaaacagggc tctgtcttct gcctctgaaa 
aagaataccc tcaggaagcg tagagtggtc 
tctgttcagg aggcctacaa gaagggccta 
gagcaggaat gtgaatggga agaaataacc 
gtcctggtag atagaaagac aggcagtcag 
cttgttgaca ggaagttctt tgatcagtac 
gctgacatga tctccttgaa aaatggtgtc 
agcgatgatg tttttagcag ctcccgacat 
agcqtcagga atttaaccat aaqgagcagc 
cccattgcag ccatctttga cacagaaaac 
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ctggagaaaa tctccattac 
8220 



cagaggcttc tggaggctca 
8280 



aagctgtcac ttcaggacgc 
8340 



gtgaagcctg ctcagaaagc 
8400 



tcagcagcag aggcagtgaa 
8460 



gagttccagt acctcacggg 
8520 



gaagaagcca tccggaaggg 
8580 



agcagctatg ccaaaatcct 
8640 



gccataaatc gctccatggt 
8700 



gtgtcgtcca agggcttacc 
8760 



ggctcccgct cgggatctcg 
8820 



ggaagctttg acgccacagg 
8880 



tctattgggc actagtagtc 
8940 



atttccactt tattaaataa 
9000 

cattctatgc ttacagaaaa 
9060 

ctttttatct tcttagctca 
9120 



tgctaatcag ttgtaacaat 
9180 



tttcgatttt tgatcaattc 
9240 



agataaaaat taaatggatc 
9300 



catattctgt attaggagaa 
9360 



cccaaaacca agcattttgg 
9420 



catatactct tcgatgtact 
9480 



agaaggtata gagcggggca 
ggcctgcaca ggtggcatca 
agtctcccag ggtgtgattg 
cttcataggc ttcgagggtg 
agaaaaatgg ctcccgtatg 
aggtcttgtt gacccggaag 
gttcatagat ggccgcgccg 
gacctgcccc aaaaccaaat 
agaagatatc actgggctgc 
cagcccttac aacatgtctt 
ctccggatct cgctccgggt 
gaattcttcc tactcttatt 
agttgggagt ggttgctata 
tagaaaagaa aatcccggtg 
tatagccatg attgaaatca 
tcttaaataa gcagtacact 
agcacaaatc gaacttagga 
tttaattttg gaagcctata 
actgatattt tagtcattct 
aattaccctc ccagcaccag 
aatgagtctc cttLagtttc 
tgtttggttt ggtattaatt 
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tcgttgacag catcacgggt 
tccacccaac cacgggccag 
accaagacat ggccaccagc 
tgaagggaaa gaagaagacg 
aggctggcca gcgcttcctg 
tgcatgggag gataagcacc 
cacagaggct gcaagacacc 
taaaaatatc ctataaggat 
gccttctgga agccgcctcc 
cggctccggg gtcccgctcc 
cccgcagtgg gtcccggaga 
cctactcatt tagcagtagt 
ccttgacttc atttatatga 
cttgcagtag agtgatagga 
aatagtaaag gctgttctgg 
tggatgcagt gcgtctgaag 
tttgtttctt ctcttctgtg 
atacagtttt ctattcttgg 
gcttctcacc taaatatctc 
cccccctctc aaacccccaa 
agagtgtgga ttgtataacc 
tgactgtgca tgacagcggc 



EP 1 355 149 A2 



aatcttttct ttggtcaaag ttttctgttt attttgcttg tcatattcga tgtactttaa 
9540 



ggtgtcttta tgaagtttgc tattctggca ataaactttt agactttt 
9588 



<210> 33 

<211> 366 

<212> DNA 

<213> human 

<220> 

<221> misc_feature 

<222> (351) . . (351) 

<223> any kind of base 

<400> 33 



gaagtgccat ttatatttat acaaaaatat tacataattc agttagtatU ggtgacataa 
60 



tttagttagt atgggtgata taatggtcat aatttttagc atctaataaa gatcttttta 
120 



tgagtcccat ataaaatatg tgaacaaagc aatcttgtca taagatttgt gatgatttag 
180 



gagaaagtac tttgagataa tttttttctg tctctttgtg aactctctca acagtagttc 
240 



tctttagatt agagccagca ggtcggccat aacagttttc ttcaaattty ggcaacagtL 
300 



atacaaatgc ttgaatttca agacaacata ttaaagggtc tatgaactgg naatctaacc 
360 



tgggtt 
366 



<210> 34 

<211> 1466 

<212> DNA 

<213> human 

<400> 34 

agccccaagc ttaccacctg cacccggaga gctgtgtgtc accatgtggg tcccggttgt 
60 

cttcctcacc ctgtccgtga cgtggattgg tgctgcaccc ctcatcctgt ctcggattgt 
120 



gggaggctgg gagtgcgaga agcattccca accctggcag gtgcttgtgg cctctcgtgg 
180 



cagggcagtc tgcggcggtg ttctggtgca cccccagtgg gtcctcacag ctgcccactg 
240 



catcaggaac aaaagcgtga tcttgctggg tcggcacagc ctgtttcatc ctgaagacac 
300 

aggccaggta tttcaggtca gccacagctt cccacacccg ctctacgata tgagcctcct 
360 
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gaagatttcga ttcctcai^gc cagglgaLga ctucagccac gacctcaL^jC Lgctccgcct 
420 



gtcagagcct gccgagctca cggatgctgt gaaggtcatg gacctgccca cccaggagcc 
480 



agcactgggg accacctgct acgccCcagg ctggggcagc attgaaccag aggagttctt 
540 



gaccccaaag aaacttcagt gtgtggacct ccatgttatt tccaatgacg tgtgtgcgca 
600 



agttcaccct cagaaggtga ccaagttcat gctgtgtgct ggacgctgga cagggggcaa 
660 



aagcacctgc tcgggtgatt ctgggggccc acttgtctgt aatggtgtgc ttcaaggtat 
720 



cacgtcatgg ggcagtgaac catgtgccct gcccgaaagg ccttccctgt acaccaaggt 
780 



ggtgcattac cggaagtgga tcaaggacac catcgtggcc aacccctgag cacccctatc 
840 



aaccccctat tgtagtaaac ttggaacctt ggaaatgacc aggccaagac tcaagcctcc 
900 



ccagttctac tgacctttgt ccttaggtgt gaggtccagg gttgctagga aaagaaatca 
960 



gcagacacag gtgtagacca gagtgtttct taaatggtgt aattttgtcc tctctgtgtc 

1020 



ctggggaata ctggccatgc ctggagacat atcactcaat ttctctgagg acacagatag 
1080 



gatggggtgt ctgtgttatt tgtggggtac agagatgaaa gaggggtggg atccacactg 
1140 



agagagtgga gagtgacatg tgctggacac tgtccatgaa gcactgagca gaagctggag 
1200 



gcacaacgca ccagacactc acagcaagga tggagctgaa aacataaccc actctgtcct 

1260 



ggaggcactg ggaagcctag agaaggctgt gagccaagga gggagggtct tcctttggca 
1320 



tgggatgggg atgaagtaag gagagggact ggaccccctg gaagctgatt cactatgggg 
1380 



ggaggtgtat tgaagtcctc cagacaaccc tcagatttga tgatttccta gtagaactca 
1440 



cagaaataaa gagctgttat actgtg 
1466 



<210> 35 

<211> 187 

<212> DNA 

<213> human 



<400> 35 

gatctggtgc attccggtcg acactctcgt ttacttggac tglaagtctg acctctatga 
60 
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ataattactt cagcccctga gtgctcccgg gccaagctcc ttggccaaac tttcacctta 
120 

gcttctgata agtcttgggc caagctaagc agcaUctatc adtcatccct tcagctcctg 

180 

attgatc 
187 

<210> 36 

<211> 2913 

<212> DNA 

<213> human 

<400> 36 

actgggtacc gaggactggg tgtgtttaaq gcagacagcc aggtgaggat cccagctact 
60 






ggggcctgct gtcatctcct gggagtaccc gggggtcagg agcctagggg actcttgcac 
120 

ttcacatcca gccatgctaa ttacactttt tggcaaaggo aacagctagg agcagtttct 
180 

ttcactccta cagccccgtt ttctcagtgt ttagacctcg aattattact gggctagagg 

240 

gaaggcagcc tctgaagtgt gqcaggagga ggggaagtct gcctgcatct tggtgtgtct 
300 

gtcagatgcc agcactaata acctggcttc tgtgaggcct gtcagtgctc tcaggaatga 
360 

aaggggaccc ctgagaggtg ctcagtacca gcaggctgtg aatgctctct acccaccacc 
420 

ctcacctcct cgttaaagat ggtgctacct gccacacagc agacatctgg Lcgctgcaca 
480 

cccgaaagac cccaaggcag tctgcccctt gtccagccac acgccagcac ccaccctcct 
540 

ggcccctgcc tcggcctccc cagaccagct gcacccagcc cccaacacgc accccttctc 
600 

cagatgtgtg cagggcctca ttttgcagag caaagacaga tgtttcagcc acacgcttta 
660 

ttaacttcta aaacctgtgc tcaggacact cttcaacagt catgaaaagt ttgatcactt 
720 

gccacagtca ggacctttgt gtggggctct gatctgatgt tcggLctcat catctcccaa 
780 

accagcagtc gtttgtaccc caaccctctg ctcaggggct cataccccca aatgattttc 
840 

ctgatttatg tatttcccta caaagggctt tctataccta gcatctgcct ccagcatgag 
900 

aagggggaat aggtgagacc catttgccag tagcagacgg ggaccctggg gagaaaatgg 

960 

cagagcctgt tggagactcc ctgtctccag ctgaccagcc aatgggattc ctcttccctc 






1020 

cactgtctcc cacaaagtag 
1080 

ccctatctag acatgaggcc 
1140 



gggtcgggga ggcagagggg 
1200 



tcacctgggg agagttcaca 
1260 

ggtgggtctg gggtgtggcc 
1320 

gcagcccagc taagccccta 
1380 

tttgagggaa atgcctaact 
1440 



agataccagg cccaccccca 
1500 

agtgtttttc acaagctcca 
1560 

cattctgaga gtcctgttgc 
1620 

gccttgagga ggatgtgcat 
1680 

tgtcccaccg gggacctcca 
1740 

agaggatgag gctggcggat 
1800 

agttgccctg tcatctggcc 
1860 



gatggcgaaa gacggccaca 
1920 

acagtggccc ctggtgcatg 
1980 

tctgaaatta gggagatatg 
2040 

ttgcttgggg gtccttggtc 
2100 



gtcccatttt gctgtactgc 
2160 



tcccctagta gggtttgttt 
2220 

cagcctctac atcagggaga 
2280 



ctaaagtaca acagccttgg 
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aagaatcctg gtacatttag 
ctttagacat gactttggca 
atgctcacac cagtaattct 
aaatactggt gcaggggtcc 
caggcatcat gatgtttcag 
gagccttgca atttccccca 
tcaggggccg taagaatccc 
gacjatgagct gaggtgggtc 
tacctccagg aaatggtgtt 
ctgtgccttg gtgcacgtgg 
taacgtggta ggggagacag 
aaaacttcat ggatgttaga 
ttagtaagag ccctccgtgt 
tctggataac ccacctctcc 
tttagtgaga cccctaaggt 
gaccacacac tctcttccct 
aatgtctttc ttgaaaactt 
aaggccagct ttggactagg 
aaactcaggc ttggttccaa 
tggggtcaca tctggtcata 
gaggtaggta gggaggagca 
aggaactgcc aggaactaag 



cccatgagcc tggcacagat 
ttgaccagcc tgttggcaat 
catcccr.tga atgcttggga 
cacctctgat gatgctgagt 
gcccccaggt gacttctcag 
aatgacctca gagggcccga 
ccagggagca tgtgaaatgc 
oggggtgaag tgcagggatc 
gtggttgggc ccgtagaaaa 
ggtggaatcc cagtggccct 
agacagctcc acctgccccc 
gcaagcagcc atgctgcagc 
ttgggctgag ttctttctct 
tccctcatcc taaaattaca 
cctccaacta gggtgggtcc 
cctctggctc aggactacgg 
ctcttcccag tcttcccact 
gcttgttgcg actaccagct 
gcttatgggg gccctgtcct 
cccttcagag agctcttccc 
ttcaaggatt agaagaagga 
ggcgagcact ggagaaggca 
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2340 



acctgggacc ccctgcgctt ctgagcagga agaccaagac cttcaggggc cctaagcact 
2400 



gaaaaccitca ttcctcatcc ccaagccctg gcatccccct gttcttctaa aataattctt 
2460 



ttctaggtat ttctgattgc aaaattctgg atgggttcat ccaagctgac ctttgctgtt 

2520 



ttttcccttc ccaacaaggc ctcacttttt ggagccacct tagttggtgc ctaggcagag 
2580 



gggcagtcag cagtggLtat caggatcctg gctctatggg ttgccttcct cctggtctgt 

2640 



aaagcccctg caggcaggga cttcttagatr. agctgcttcc ttagggcatg gcatgtggtg 
2700 



ggtggttaat gaatggaaga gagggaatga gtgatcaagg gagggaggag ggagtggagt 

2760 



ggagatttct catcctttcc tgttaattta Ugacatcctc ctgcctatga gtccttgact 
2820 



ctggagtttt acaaagcagt cacatttcaa ataaaagtct gggaaagcaa cacatcatcg 
2880 



ccaactttta attttgctaa ataaggatat tag 
2913 



<210> 37 

<211> 1466 

<212> DNA 

<213> human 

<400> 37 

agccccaagc ttaccacctg cacccggaga gctgtgtgtc accatgtggg tcccggttgt 
60 

cttcctcacc ctgtccgtga cgtggattgg tgctgcaccc ctcatcctgt ctcggattgt 
120 



gggaggctgg gagtgcgaga agcattccca accctggcag gtgcttgtgg cctctcgtgg 
180 



cagggcagtc tgcggcggtg ttctggtgca cccccagtgg gtcctcacag ctgcccactg 
240 



catcaggaac aaaagcgtga tcttgctggg tcggcacagc ctgtttcatc ctgaagacac 
300 



aggccaggta tttcaggtca gccacagctt cccacacccg ctctacgata tgagcctcct 
360 



gaagaatcga ttcctcaggc caggtgatga ctccagccac gacctcatgc tgctccgcct 
420 



gtcagagcct gccgagctca cggatgctgt gaaggtcatg gacctgccca cccaggagcc 
480 



agcactgggg accacctgct acgcctcagg ctggggcagc attgaaccag aggagttctt 
540 
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gaccccaaag aaacttcayt gtgtggacct ccatgttatt tccaatgacg tgtgtgcgca 
600 

agttcaccct cagaaggtga ccaagttcat gctgtgtgct ggacgctgga cagggggcaa 
660 

aagcacctgc tcgggtgatt ctgggggccc acttgtctgt aatggtgtgc ttcaaggtat 
720 

cacgtcatgg ggcagtgaac catgtgccct gcccgaaagg ccttccctgt acaccaaggt 
780 

ggtgcattac cggaagtgga tcaaggacac catcgtggcc aacccctgag cacccctatc 
840 

aaccccctat tgtagtaaac ttggaacctt ggaaatgacc aggccaagac tcaagcctcc 
900 

ccagttctac tgacctttgt ccttaggtgt gaggtccagg gttgctagga aaagaaatca 
960 

gcagacacag gtgtagacca gagtgtttct taaatggtgt aattttgtcc tctctgtgtc 
1020 

ctggggaata ctggccatgc ctggagacat atcactcaat ttctctgagg acacagatag 
1080 

gatggggtgt ctgtgttatt tgtggggtac agagatgaaa gaggggtggg atccacactg 
1140 

agagagtgga gagtgacatg tgctggacac tgtccatgaa gcactgagca gaagctggag 
1200 

gcacaacgca ccagacactc acagcaagga tggagctgaa aacataaccc actctgtcct 
1260 

ggaggcactg ggaagcctag agaaggctgt gagccaagga gggagggtct tcctttggca 
1320 

tgggatgggg atgaagtaag gagagggact ggaccccctg gaagctgatt cactatgggg 
1380 

ggaggtgtat tgaagtcctc cagacaaccc tcagatttga tgatttccta gtagaactca 
1440 

cagaaataaa gagctgttat actgtg 
1466 



<210> 38 

<211> 462 

<212> DNA 

<213> human 

<220> 

<221> misc_feature 

<222> (197) . . (197) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (116) . . (116) 

<223> any kind of base 
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<220> 

<221> misc_feature 

<222> (334) . . (334) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (402) . . (402) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (429) .. (429) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (438) . . (438) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (443) . . (443) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (459) . . (459) 

<223> any kind of base 



<400> 38 

taaggtttta taattatttt tatttttctt ttcttttttt tttatggctt ggatgacact 
60 

ttattttcag atccaatact agaagttgtt tccatgttca cattttcctt cctggnttaa 
120 

aaaaaagagt tgtatttttt ttttttgctt tttttaaatt atactttaag ttttagggta 
180 

catgtgcaca acgcagnggt tagctacata tgtatacatg tgccatgttg gcgtgctgca 
240 

tccagtaact cgtcatttaa cattaggtat atctccaaat gctatccttc cccccattgt 
300 

atttttcata gcttaaaaat cattgacata ggantaattc caactaaagt acggtattaa 
360 

atccctgggg gaataaattt tgtcttaaca agggtaaggt tngtgaaaag gatggttttg 
420 

tcacagggna aaaggganat ccncccattt taaaacccnc ct 
462 



<210> 39 

<211> 1490 

<212> DNA 

<213> human 

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<400> 39 

ctcgrgcccc ccacggaggg gactgctctc ccccgctgca tcctttctgt gaggtocctt 
60 

acccacctca ycacctgaga gggtgaaata gaattctaac ctcgacattc gggaagtgtt 
120 

tttgagaagt ctcggtcggt aagggaagtc ttccaagtcc gtgcagcact aacgtattgg 
180 

cacctgcctc ctcttcggcc accccccaga tgaggcagct gtgactgtgt caagggaagc 
240 

cacgactctg accatagtct tctctcagct tccactgccg tctccacagg aaacccagaa 
300 

gttctgtgaa caagtccatg ctgccatcaa ggcatttatt gcagtgtact atttgcttcc 
360 

aaaggatcag gggatcaccc tgagaaagct ggtacggggc gccaccctgg acatcgtgga 
420 

tggcatggct cagctcatgg aagtactttc cgtcactcca actcagagcc ctgagaacaa 
480 

tgaccttatt tcctacaaca gtgtctgggt tgcgtgccag cagatgcctc agataccaag 
540 

agaUaacaaa gctgcagctc ttttgatgct gaccaagaat gtggattttg tgaaggatgc 
600 

acatgaagaa atggagcagg ctgtggaaga atgtgaccct tactctggcc tcttgaatga 
660 

tactgaggag aacaactctg acaaccacaa tcatgaggat gatgtgttgg ggtttcccag 
720 

caatcaggac ttgtattggt cagaggacga tcaagagctc ataatcccat gccttgcgct 
780 

ggtgagagca tccaaagcct gcctgaagaa aattcggatg ttagtggcag agaatgggaa 
840 

gaaggatcag gtggcacagc tggatgacat tgtggatatt tctgatgaaa tcagccctag 
900 

tgtggatgat ttggctctga gcatatatcc acctatgtgt cacctgaccg tgcgaatcaa 
960 

ttctgcgaaa cttgtatctg ttttaaagaa ggcacttgaa attacaaaag caagtcatgt 
1020 

gacccctcag ccagaagata gttggatccc tttacttatt aatgccattg atcattgcat 
1080 

gaatagaatc aaggagctca ctcagagtga acttgaatta tgacttttca ggctcatLtg 
1140 

tactctcttc ccctctcatc gtcatggtca ggctctgata cctgctttta aaatggagct 
1200 

agaatgcttg ctggattgaa agggagtgcc Uatctatatt tagcaagaga cactattacc 
1260 

aaagattgLL ggttaggcca gattgacacc tatttataaa ccatatgcgt atatttttct 
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1320 



gtgctatata tgaaaaataa ttgcatgatt tctcattcct qagtcatttc tcagagattc 
1380 



ctaggaaagc tgccttatLc tcLLtttgca gtaaagtatg ttgttLtcat tgtaaagatg 
1440 



ttgatggtct caataaaatg ctaacttgcc agtgattaaa aaaaaaaaaa 
1490 



<210> 40 

<211> 1677 

<212> DNA 

<213> human 

<400> 40 

cttgacccta tbtatagtgg ctctaaaggt ggtgttatta tgttttctag agcacttcga 
60 

ttatacaaac gtcaaggaat ccgagttaat gtgctttgcc ctgagtttgt tgaaacagac 
120 



atgggcacaa tgatcggtcc caaattcctt agtatgatgg ggggctttgt acctatggaa 
180 



atggtggtga aaggtgcttt tgagctcatc actgatgaga ataaagccgg cgattgccta 
240 



tggattacta atcggcgagg tcttgagtac tggcccaccc catcagaaga agcaaagtac 
300 



ttgctgcgtt ctacacgttc caggagaaga actgaataca aagctccacc aattaaacta 
360 



cctgagagtt ttgagaaaat agttgttcag accttgactc acaactttcg gaatgctacc 
420 

agtgtagtaa gagcaccact gagattacct atcaaaccaa actatgttct tgtgaagata 
480 

atctatgctg gtgtaaatgc tagtgatgta aattctagct caggtcgcta ttttggtggc 
540 

aataacagtg acactgcatc ccgtcttccg tttgatgcag gatttgaggc tgtgggagta 
600 

attgcagcag ttggggattc tgttactgac ttgaaagttg gcatgccttg tgcgttcatg 
660 

acttttggag gctatgctga atttacaatg attccttcga aatacgccct tccaatgcct 
720 



agaccagaac cggaaggtgt tgccatgctt acatcaggat taacagcttc aattgctcta 
780 



gaaaaggcag gacagatgga atctggaaaa gtggtccttg ttactgctgc ggcaggagga 
840 



actggtcagt ttgctgttca gcttgcaaaa ttagctggta ataccgtggt tgccacttgt 
900 



ggaggtgggg caaaggccaa gctCctgaaa gaattgggag tcgacagagt ca tagactat 
960 
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cacagtgaag atataaaaac 
1020 

Lacyaatctg ttggtgggga 
1080 

cgactcattg tcattggcat 
1140 

aaatatcctg gactatgtga 
1200 

cUggtgcaat atagtcacat 
1260 

tccggaaaac taaaggttgc 
1320 

gatgctgttg agtatctcca 
1380 

ccgaccttcg gtcatcaagt 
1440 

agtgaagttt tcaattctta 
1500 

agaccagtgc tggaatattt 
1560 

aatccattta tgtataccat 
1620 

gcgagatatc tacaaaataa 
1677 



<210> 41 

<211> 1330 

<212> DNA 

<213> human 

<400> 41 

atggcgcagt gggacagctt 
60 

gtgaagtttg atgctcgctc 
120 

ccacttcaag aaaagctgaa 
180 

tttgctgttc tcattcctat 
240 

aagaattgca cagttggttc 
300 



ggaaatgaca gtgaagatqa 
360 

atggagaaaa gaatccaata 
420 

tttcaaaatt tcagtgtgac 
480 



ggttctaagg aaagagttcc 
catgttoaog ttgtgcttgg 
gatttctcag Latcaaggag 
gaagctcttg tcaaagagtc 
gtaccaagaa caccttaaca 
tgtggatcca aagagattta 
ttcaggcaaa agcgttggga 
agccaaatta tgaatgaaca 
gtctagagat tgttctcgaa 
attctcaatg ctttttcaat 
gtttatgUtt acactataca 
attataatcc tttcatttta 

cactgatcaa caggaggaca 
caatacagct ttgcttcccc 
atccttcaaa gctgcactga 
catcgcaata atggcagctc 
aattaatgca aacagtgtat 
agtgagattt cgagaagttg 
Uatttcagat actgaagaaa 
aactgatcaa cgatttgctg 



cgaaaggtaL tgatatcacc 
atgctttggc agtccatgga 
aaaatggttg gacgccatca 
aaactgtggc tggctttttc 
agttatttga cctttactct 
taggccttca ttctgttgct 
aggtggttgt ctgcgtggac 
cggtgtcaaa tacagaaaga 
tgttactgaa aatagctgct 
tttggattac ttgaaagaat 
acaactatga gcagaagaaa 
aaaaaaaaaa aaaaaaa 

ctgatagctg ttcagaatct 

caaatcctaa aaatggccct 

ttgcccttta tctccttgtg 

aactcctgaa gtgggaaatg 

cctccagtct cctgggaaga 

ttatggaaca cattagcaag 

atctcgtaga ttcagagcat 

atgttcttct ccaactaagt 
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accttggttc ccacagtcca gggacatggg aatgccgtag atgaaatcac caggtcctta 
540 

ataagtctga ataccacgct gcttgatttg cacctctatg tagaaacact gaatgtcaaa 

600 

ttccaggaga atacacttaa agggcaagag gaaatcagca aattaaagga gcgtgtgcac 
660 

aatgcatcag cagaaattat gtctatgaaa gaagaacaag tgcatttgga acaggaaata 
720 

aaaagagaag tgaaagtcct gaataacatc actaatgatc tcaggctgaa agattgggaa 
780 

cattctcaga cgttgagaaa tatcacttta attcaaggtc ctcctggacc cccaggagaa 

840 

aaaggagata gaggtccaac cggagaaagt ggtccaccag gcgttccagg tccagtaggt 
900 

cctccaggtc ttaagggtga tcgaggatct attggctttc cgggaagtcg aggatatcca 
960 

ggacaatcag ggaagactgg gaggacagga tatcctggac caaaaggcca aaaggqaqaa 
1020 

aaaggcagtg gaagcatcct gactccttct gcgactgtcc gactggttgg Lggccgtggc 
1080 

cctcatgagg gtagagtgga gataLtgcac aatggacagt ggggcacagt ttgtgatgat 
1140 

cactgggaac tgcgtgccgg gcaggttgtc tgcaggagct tgggataccg aggtgttaag 
1200 

agtgtgcaca agaaagctta ttttggacaa ggtactggtc ccatttggct qaatqaagta 
1260 

ccctgtttgg ggatggagtc atccattgaa gagtgcaaaa tcagacagtg gggcgtgaga 
1320 

gtctgttcac 
1330 



<210> 42 

<211> 431 

<212> DNA 

<213> human 

<220> 

<221> misc_feature 

<222> (97) . . (97) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (347) . . (347) 

<223> any kind of base 



<220> 

<221> misc_feature 
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<222> (349) (349) 

<223> any kind of base 

<220> 

<221> misc_feature 

<222> (361) . . (361) 

<223> any kind of base 

<220> 

<221> misc_feature 

<222> (362) . . (362) 

<223> any kind of base 

<220> 

<221> misc_feature 

<222> (363) . . (363) 

<223> any kind of base 

<220> 

<221> misc_feature 

<222> (401) . . (401) 

<223> any kind of base 

<220> 

<221> misc_feature 

<222> (428) . . (428) 

<223> any kind of base 

<400> 42 

ctttttatat ttattttcat cgctacacaa acatttttta ggagtttgat tctacctcca 
60 

ttttggttag atatacaaac tctaccccat gagggantgt atggtgtatt tctagattta 
120 






gcaacaattt tcttgaaaaa tgtacaatac tatagaaaaa tgaagatagt aaataccagg 
180 

tataagttaa taacagtgtt tcttttgttc agtaataatg aactgtgtac tagcactgaa 

240 

ctttaggccc tcctatttgc gtattttctg tttgtatatt tctaaataga ggaattgtga 
300 

ttataatatt attattttgg aatatcctaa atcataaatt caaaacntna tttagttttt 

360 

nnnttttttt tttaagatgg agtcccgctt tgtcccaggc nggagtgcag tggcatgatc 
420 

tcagctcnct g 
431 



<210> 43 

<211> 669 

<212> DNA 

<213> human 

<220> 
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<221> misc_£eature 
<222> (641) . . (641) 
<223> any kind of base 



<400> 43 

ttcttttgga aaaccaaaca tgctttattt catttttttc acaatttatt taaacatctc 
60 

acatatacaa aataggtaca atttaatttt tctgcttgcc caagaaacaa agcttctgtg 
120 



gaaccatgga agaagatgaa aatgagactg gcaaagaaca aatgctgaat ctgaagaaga 
180 



ggacaacttt gggcaaataa tctgcatact tttaattggg aataagatgg aaaatatgaa 
240 



tgctaaatca aattttttaa aaaatacacc acacgataca actcaataca ggagtatttc 
300 



ttctcaaatt cttctagcac catcaacatt cttcaagtat ctgaaatact attaattagc 
360 

acctttgtat tatgaacaaa acaaaacaag gacctcagtt catctctgtc taggtcagca 
420 



cctaacaatg tggatcacac tcatgggaaa gtgttttgag gtagtttaaa cctttggaag 
480 



tttgggtttt aaacttccct ctgtggaaga tattcaaaag ccacaagtgg tgcaaatgtt 
540 



tatggttttt atttttcaat ttttattttg gctttcttac aaaggttgac atttttcata 
600 



acaggtgtaa gagtgttgaa aaaaaaattt caatttttgg ngggaacggg ggaaggagtt 
660 



aatgaaact 
669 



<210> 44 

<211> 287 

<212> DNA 

<213> human 

<400> 44 

gccggagagt ctacaatgtt acccagcatg ctgttggcat tgttgtaaac aaacaagtta 
60 

agggcaagat tcttgccaag agaattaatg tgcgtattga gcacattaag cactctaaga 
120 



gccgagatag cttcctgaaa cgtgtgaagg aaaatgatca gaaaaagaaa gaagccaaag 
180 



agaaaggtac ctgggttcaa ctaaagcgcc acgctgctcc acccagagaa gcacactttg 
240 



tqagaaccaa tgggaaggag cctgagctgc tggaacctat tccctat 
287 



<210> 45 
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<211> 383 

<212> DNA 

<213> human 

<220> 

<221> misc_feature 

<222> (147) . . (147) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (309) . . (309) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (349) . . (349) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (365) . . (365) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (372) . . (372) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (380) . . (380) 

<223> any kind of base 



<400> 45 

ggaacggaaa aggagaattc aagtgtgacc ctcatgaggc aacgtgttat gatgatggga 
60 

agacatacca cgtaggagaa cagtggcaga aggaatatct cggtgccatt tgctcctgca 
120 

catgctttgg aggccagcgg ggctcgnctt gtgacaactg ccgcagacct ggggggtgaa 
180 

cccagtcccg aaggcactac tggccagtcc tacaaccagt attcttcaga gataccattc 

240 

agagaacaaa cactaatgtt taatttgccc aatttgagtg cttcatgcct tttaggatgt 
300 

tacaggctng acagagaagg ttttcccgag gagttaaatc atctttttnc catttcccga 
360 

ggggncaagg cntgtttttn ttt 
383 



<210> 46 
<211> 523 
<212> DNA 
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<213> human 
<400> 46 

caqaqqgqca gggcgqacgg ctaggagttc aagaaacatc ctggtctgag ggaaaggctg 
60 

cagctgcacc gccatgaata agcttttcag cttctggaag agqaagaatg agacccgcag 
120 

ccagggctac aaccttcgag aaaaggattt aaagaaactt cacagagctg cttcagtcgg 
180 

ggatttgaag aagctgaagg aataccttca gatcaagaaa tatgatgtaa atatgcagga 
240 

ctatgaatac agaacacctt tgcacctagc ctgtgctaat ggacatacag atgttgtact 
300 

tttcctaatt gagcaacaat gcaagataaa tgtccgggat agtgaaaaca aatccccatt 
360 

gattaaggca gtacagtgtc aaaatgagga ttgtgctact attctgctaa actttggtgc 
420 

agacccagat ctgagggata ttcgttataa tactgtfcctt cactatgctg tttgtggtca 
480 

aagtttgtca ttagttgaaa aactgcttga atacgaagct gat 
523 



<210> 47 

<211> 390 

<212> DNA 

<213> human 

<400> 47 

tccaaggtca tggcaaaaca tctgaagttc atcgccagga ctgtgatggt acaggaaggg 
60 

aacgtggaaa gcgcatacag gaccctaaac agaatcctca ctatggatgg gctcattgag 
120 

gacattaagc atcggcggta ttatgagaag ccatgccgcc gcgacagagg gaaagctatg 
180 

aaaggtgccg gcggatctac aacatggaaa tggctcgcaa gatcaacttc ttgatgcgaa 
240 

agaatcgggc agatccgtgg cagggctgct gaggcctgtg ggtgggacac cagtgcgaaa 
300 

ccctcatcca gttttcUctc catctctttt ctttgtacaa tcccatttcc tattaccatt 
360 

ctctgcaata aactcaaatc acatgtctgc 
390 



<210> 48 

<211> 669 

<212> DNA 

<213> human 

<220> 

<221> misc_feature 
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<222> (641) . . (641) 
<223> any kind of base 



<400> 48 

ttcttttgga aaaccaaaca tgctttattt catttttttc acaatttatt taaacatctc 
60 

acatatacaa aataggtaca atttaatttt tctgcttgcc caagaaacaa agcttctgtg 
120 

gaaccatgga agaagatgaa aatgagactg gcaaagaaca aatgctgaat ctgaagaaga 
180 

ggacaacttt gggcaaataa tctgcatact tttaattggg aataagatgg aaaatatgaa 
240 

tgctaaatca aattttttaa aaaatacacc acacgataca actcaataca ggagtatttc 
300 

ttctcaaatt cttctagcac catcaacatt cttcaagtat ctgaaatact attaattagc 
360 

acctttgtat tatgaacaaa acaaaacaag gacctcagtt catctctgtc taggtcagca 
420 

cctaacaatg tggatcacac tcatgggaaa gtgttttgag gtagtttaaa cctttggaag 
480 

tttgggtttt aaacttccct ctgtggaaga tattcaaaag ccacaagtgg tgcaaatgtt 
540 

tatggttttt atttttcaat ttttattttg gttttcttac aaaggttgac atttttcata 
600 

acaggtgtaa gagtgttgaa aaaaaaattt caatttttgg ngggaacggg ggaaggagtt 
660 

aatgaaact 
669 



<210> 49 

<211> 431 

<212> DNA 

<213> human 

<220> 

<221> misc_feature 

<222> (97) . . (97) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (347) . . (347) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (349) . . (349) 

<223> any kind of base 



<220> 
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<221> miscfeature 
<222> (361) . . (361) 
<223> any kind of base 



<220> 

<221> misc_feature 

<222> (362) . . (362) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (363) . . (363) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (401) . . (401) 

<223> any kind of base 



<220> 

<221> misc_feature 

<222> (428) . . (428) 

<223> any kind of base 



<400> 49 

ctttttatat ttattttcat cgctacacaa acatttttta ggagtttgat tctacctcca 
60 

ttttggttag atatacaaac tctaccccat gagggantgt atggtgtatt tctagattta 
120 

gcaacaattt tcttgaaaaa tgtacaatac tatagaaaaa tgaagatagt aaataccagg 
180 

tataagttaa taacagtgtt tcttttgttc agtaataatg aactgtgtac tagcactgaa 
240 

ctttaggccc tcctatttgc gtattttctg tttgtatatt tttaaataga ggaattgtga 
300 

ttataatatt attattttgg aatatcctaa atcataaatt caaaacntna tttagttttt 
360 

nnnttttttt tttaagatgg agtcccgctt tgtcccaggc nggagtgcag tggcatgatc 
420 

tcagctcnct g 
431 




Claims 

1. A method of assessing colorectal cancer status comprising identifying differential modulation of each gene (relative 
to the expression of the same genes in a normal population) in a combination of genes selected from the group 
consisting of Seq. ID. No. 42-49. 
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2. The method of claim 1 wherein there Is at least a 2 fold difference in the expression of the modulated genes. 

3. The method of claim 1 wherein the p-value indicating differential modulation is less than .05. 

4. The method of claim 1 further comprising employing a colorectal diagnostic that is not genetically based. 

5. The method of claim wherein the cancer marker that is not genetically based is selected from the group consisting 
of carcinoembryonic antigen, CA 19-9, CA 125, CK-BB, and Guanylyl Cyclase C. 

6. A diagnostic portfolio comprising isolated nucleic acid sequences, their complements, or portions thereof of a 
combination of genes selected from the group consisting of Seq. ID. No. 42-49. 

7. The diagnostic portfolio of claim 6 in a matrix suitable for identifying the differential expression of the genes con- 
tained therein. 

8. The diagnostic portfolio of claim 7 wherein said matrix is employed in a microarray. 

9. The diagnostic portfolio of claim 8 wherein said microarray is a cDNA microarray. 

10. The diagnostic portfolio of claim 8 wherein said microarray is an oligonucleotide microarray. 

11. A diagnostic portfolio comprising isolated nucleic acid sequences, their complements, or portions thereof of a 
combination of genes selected from the group consisting of Seq. ID. No. 42-49. 

12. A kit for diagnosing colorectal cancer comprising isolated nucleic acid sequences, their complements, or portions 
thereof of a combination of genes selected from the group consisting of Seq. ID. No. 42-49. 

13. The kit of claim 12 further comprising reagents for conducting a microarray analysis. 

14. The kit of claim 12 further comprising a medium through which said nucleic acid sequences, their complements, 
or portions thereof are assayed. 

15. A method of assessing response to treatment for colorectal cancer comprising identifying differential modulation 
of each gene (relative to the expression of the same genes in a normal population) in a combination of genes 
selected from the group consisting of Seq. ID. No. 42-49. 

16. The method of claim 15 wherein the assessment of the response to therapy includes a determination of whether 
the patient is improving, not improving, relapsing, likely to improve, or likely to relapse. 

17. Articles for assessing colorectal cancer status comprising isolated nucleic acid sequences, their complements, 
or portions thereof of a combination of genes selected from the group consisting of Seq. ID. No. 42-49. 

18. Articles for assessing colorectal cancer status comprising representations of isolated nucleic acid sequences, their 
complements, or portions thereof of a combination of genes selected from the group consisting of Seq. ID. No. 
42-49. 
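
The thresholds recited in claims 2 and 3 (at least a 2 fold difference in expression and a p-value less than .05) can be illustrated with a minimal sketch. The claims do not name a statistical test, so the exact permutation test used here is an assumption made only for illustration, as are the function names `fold_change`, `permutation_p_value` and `is_differentially_modulated` and the choice of comparing group means. Expression values are assumed to be positive, linear-scale intensities.

```python
from itertools import combinations
from statistics import mean


def fold_change(sample, reference):
    """Ratio of mean expression in the sample group to the reference group."""
    return mean(sample) / mean(reference)


def permutation_p_value(sample, reference):
    """Exact two-sided permutation p-value for the difference in group means.

    Assumption: the claims do not specify a test; this one is illustrative.
    Every way of splitting the pooled values into two groups of the original
    sizes is enumerated, so it is only practical for small groups.
    """
    pooled = list(sample) + list(reference)
    n = len(sample)
    m = len(pooled) - n
    total = sum(pooled)
    observed = abs(mean(sample) - mean(reference))
    hits = 0
    count = 0
    for idx in combinations(range(len(pooled)), n):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n - (total - group_sum) / m)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            hits += 1
        count += 1
    return hits / count


def is_differentially_modulated(sample, reference, min_fold=2.0, alpha=0.05):
    """True if the gene is up- or down-modulated by at least min_fold
    and the permutation p-value is below alpha."""
    fc = fold_change(sample, reference)
    big_enough = fc >= min_fold or fc <= 1.0 / min_fold
    return big_enough and permutation_p_value(sample, reference) < alpha


if __name__ == "__main__":
    # Hypothetical intensities for one gene; not data from this document.
    tumor = [8.1, 9.0, 7.6, 8.8]
    normal = [2.0, 2.4, 1.9, 2.2]
    print(is_differentially_modulated(tumor, normal))
```

With four values per group there are only 70 possible splits, so the smallest attainable two-sided p-value is 2/70; larger groups would call for a sampled permutation test or a parametric test instead.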
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